Agentic Programming Languages

History.md

Knowledge is at the core of human progress, yet a narrow view of what counts as knowledge has at times stifled that progress. In a sense, this is the perennial tension between scientists and engineers. On the one hand, the scientist places paramount value on "factual knowledge," which is usually declarative. Mathematicians, for example, use a formal framework with a standard language to derive ever more complex declarative knowledge, and the theoretical sciences likewise build systems of factual knowledge. On the other hand, engineers may borrow the language of science, but they mostly exercise procedural knowledge, the "how to," which is much looser and varies from person to person. Historically, though, the two kinds of knowledge feed each other in a cycle.

In ancient Egypt and Mesopotamia, land surveying and architecture were highly secretive, procedural crafts. "Rope stretchers" knew how to create perfect right angles and structural foundations using knotted ropes and physical intuition. This is also called tacit knowledge. Around 300 BC, Euclid wrote The Elements. He took the builders' physical and procedural knowledge and codified it into pure declarative axioms and theorems. He captured the "how" and turned it into an absolute "what".

Once Euclidean geometry was published and taught worldwide, the knowledge of "how to make a right angle" lost its premium value. It became standard. The value then shifted back to new procedures: how to use this foundational geometric bedrock to engineer aqueducts, domes, and ballistics, which were the procedural engineering of the Roman era.

The period we now call the Enlightenment marks another turn of this cycle. During the Middle Ages, knowledge was locked inside the guild system. Masons, metallurgists, and apothecaries relied entirely on tacit, procedural knowledge that was passed down through decades of physical apprenticeship and trial and error. You couldn't read about steel tempering; you just had to do it in a workshop.

Diderot's Encyclopédie, the Enlightenment, and the printing press forced this procedural knowledge into the open. Treatises on metallurgy, agriculture, and physics dissected the artisan's craft into declarative formulas, measurements, and laws, which eventually led to the new science of thermodynamics and, later, quantum mechanics.

In this essay, we look at one way in which language shifts from procedural to declarative and back to procedural again. The three central figures of American pragmatism, Peirce, James, and Dewey, argued that purely declarative knowledge is a necessary illusion and that our belief in facts derives from their practical consequences. We don't just passively observe the world; knowing is a tool for doing.

Programming.md

With the advent of agentic frameworks, programming has changed dramatically over the past year. The initial optimism about "automating all of software engineering" is starting to show cracks, and some companies have begun rehiring software engineers they had laid off. What went wrong? Isn't writing code as easy as a sequence of prompts?

Programmers are still confused, trying to find ways to make agents useful in their workflow. Unlike before, this time it's not about lower-level coding alone. Now, programmers find themselves facing a new layer of challenges that largely stem from where they spend most of their time: the prompts.

Prompt engineering is the name we assign to this new practice. Initially, it seemed that prompt engineering was just natural language, but we slowly realized that we needed a better framework for the "coding agent" to perform better.

Design.md

The design.md project is an example of the movement toward design languages built around LLMs. But to understand its approach, we first have to go back to the emergence of "design languages" themselves. In the early days of computer graphics, developers had to write procedural algorithms to draw shapes and render pixels line by line. Eventually, graphics engines emerged as a set of functions, a kernel capable of doing the "basic work" like drawing a circle or creating a color gradient. At this point, the work shifted toward declaring the what (e.g., display: flex; background: red; in languages like CSS), leaving the browser engine to "render" it and figure out the how.

In computer science, there are two main camps in programming languages. On one side, we have the familiar imperative programming languages that emphasize how to solve a problem. On the other, we have declarative programming, which emphasizes the what: the constraints.

Most of what we mean by programming falls within the realm of imperative or procedural languages, in which we design algorithms and step-by-step processes with states that change to create the right flow from inputs to outputs.

Declarative languages, on the other hand, only describe what the output should be. In SQL, we only specify what we want to select from the data (e.g., all males older than 56 in the database table "people"), without specifying how. These languages work on a set of constraints.
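To make the contrast concrete, here is a minimal sketch in Python using the standard-library sqlite3 module. The essay only names the table "people"; the columns (name, sex, age) and the sample rows are assumptions made up for illustration.

```python
import sqlite3

# Hypothetical table layout and rows: the essay only names the table "people".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, sex TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [("Ann", "female", 61), ("Bob", "male", 58), ("Carl", "male", 40), ("Dan", "male", 70)],
)

# Declarative: state WHAT we want; the engine decides how to scan or index.
declarative = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM people WHERE sex = 'male' AND age > 56 ORDER BY name"
    )
]

# Imperative: spell out HOW, step by step, with explicit state.
imperative = []
for name, sex, age in conn.execute("SELECT name, sex, age FROM people"):
    if sex == "male" and age > 56:
        imperative.append(name)
imperative.sort()

print(declarative)  # ['Bob', 'Dan']
```

Both versions return the same rows. The difference is who owns the "how": in the first, the database engine does; in the second, we do.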

Now let's return to the design.md project. Its core premise is to extend CSS, which rests purely on normative constraints, with broader prose constraints, which are much looser. The design.md approach splits cognitive load into two classes:

  • The Normative (YAML): The absolute, machine-readable constants of your world. In design, this is exactly #1A1C1E or 16px.
  • The Prose (Markdown): The human-readable rationale or "loose constraints." Instead of just declaring a color, the prose can say: this is the primary color (where to use it is up to the agent).

This combination is incredibly powerful because LLMs are semantic engines. They need the strict boundaries of normative data so they don't hallucinate values, but they also need the semantic context of the prose to know how and where to apply those values.

Agents.md

The framework above goes beyond the design case. The irony of history is that much early AI work (in the 1970s) started from the declarative paradigm. Prolog, created in 1972 by the French computer scientist Alain Colmerauer together with Philippe Roussel, served as a very early basis for "question answering" intelligent systems.

To understand it, consider this Prolog example:

Prolog
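% Facts: who is a parent of whom.
mother_child(trude, sally).
father_child(tom, sally).
father_child(tom, erica).
father_child(mike, tom).

% X and Y are siblings if they share a parent Z and are not the same person.
sibling(X, Y) :-
    parent_child(Z, X),
    parent_child(Z, Y),
    X \= Y.

% A parent is either a father or a mother.
parent_child(X, Y) :- father_child(X, Y).
parent_child(X, Y) :- mother_child(X, Y).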

This code describes a set of constraints between constants (the people: trude, sally, tom, etc.). We can then query whether sally is the sibling of erica:

Prolog: Query
?- sibling(sally, erica).
Yes

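To answer this procedurally (imperatively), we would have to write the search ourselves: look up each person's parents and compare them, and for deeper relations such as ancestors, add recursive path-finding. Here that is unnecessary because the engine already implements the search. Declarative languages were in fact used very widely: in design, question answering, planning (the Action Description Language describes actions by their preconditions and effects and leaves the ordering to a planner), and more.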
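For comparison, here is a minimal imperative sketch of the same sibling query in Python. The facts mirror the Prolog program; the helper names parents_of and sibling are hypothetical and chosen only for this illustration.

```python
# Same facts as the Prolog program, stored as (parent, child) pairs.
father_child = [("tom", "sally"), ("tom", "erica"), ("mike", "tom")]
mother_child = [("trude", "sally")]


def parents_of(person):
    """Collect every parent of `person` by scanning both fact lists."""
    parent_child = father_child + mother_child
    return {parent for parent, child in parent_child if child == person}


def sibling(x, y):
    """True when x and y are different people who share at least one parent."""
    if x == y:
        return False
    return bool(parents_of(x) & parents_of(y))


print(sibling("sally", "erica"))  # True
```

Note what the imperative version gives up: it answers only the yes/no question it was written for. The Prolog rule can also enumerate all siblings of sally with a query such as sibling(sally, X), with no extra code.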

Though elegant, these AI programming languages had shortcomings. A problem could only ever be modeled partially, and closing the gaps meant adding constraint after constraint until the rule base became unmanageable. These symbolic approaches didn't achieve broad success, except in very specific domains where the data were already in the right format.

But declarative approaches are far from dead. Recent progress such as design.md is instructive. The constraints are not as rigid as Prolog's, but not as loose as plain natural language. Instead, we have the same two classes:

  • Normative Constraints:
    • "All database queries must go through the ORM."
    • "No direct HTTP calls outside of the /services directory."
    • "Strictly enforce 3-tier architecture: Controller / Service / Repository."
    • "Always run tests after disconnecting from database."
  • Prose Constraints:
    • "We prefer functional purity in the /utils folder. Avoid mutating state."
    • "When handling user data, always prioritize explicit error handling over silent failures. The system should fail loud and fail early."

The second set contains the "loose" constraints that were previously impossible to declare. Interpreting them is the magic of current LLMs, but we have to see the whole picture: sooner or later these abilities will seem less impressive, and using them properly will matter more.
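One practical difference between the two classes is that a normative constraint is rigid enough to check mechanically, while a prose constraint has to be handed to the LLM as context. As a rough sketch of the first half, the rule "No direct HTTP calls outside of the /services directory" could be approximated in Python as below. Which imports count as "direct HTTP calls" (requests, httpx, urllib) is an assumption, and violates_http_rule and the sample files are hypothetical.

```python
import re
from pathlib import PurePosixPath

# Assumption: importing one of these clients counts as a "direct HTTP call".
HTTP_IMPORT = re.compile(
    r"^\s*(?:import|from)\s+(requests|httpx|urllib)\b", re.MULTILINE
)


def violates_http_rule(path: str, source: str) -> bool:
    """Flag files outside the services directory that import an HTTP client."""
    in_services = "services" in PurePosixPath(path).parts
    return (not in_services) and bool(HTTP_IMPORT.search(source))


# Hypothetical project files, given as path -> source text.
files = {
    "services/billing.py": "import requests\n",
    "utils/format.py": "import json\n",
    "controllers/user.py": "from httpx import get\n",
}

violations = sorted(p for p, src in files.items() if violates_http_rule(p, src))
print(violations)  # ['controllers/user.py']
```

No comparable check exists for "fail loud and fail early"; that judgment is exactly what the prose half delegates to the agent.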

A Shared Schema Language

Agentic programming languages can be the next phase of programming language development: they revive older declarative languages but make them far more useful, because we can now close the loop on "uncertainties." However, we need better formatting than what's described above.

Following the YAML format of design.md, we create a set of constraints with constants. Suppose we have a set of related processes that need to run frequently, such as auditing code from different perspectives (software engineering correctness, business contracts, internal constraints, domain constraints, regulatory requirements, etc.). They all need to work within the same language so that different agents' results are comparable and communicable throughout the organization.

A shared YAML schema might look like this:

YAML: _schema.yaml
version: "1.0"

# Severity levels
# Domain-specific interpretations live in each agent's SEVERITY GUIDE prose.
# These are the normative labels and base definitions shared across all agents.

severity:
  CRITICAL:
    label: CRITICAL
    definition: "Cannot ship or use as-is. Requires immediate
      remediation before the pipeline or task is trusted."
  MAJOR:
    label: MAJOR
    definition: "Significant gap or inconsistency.
      Fix before scale; the artifact is still directionally useful."
  MINOR:
    label: MINOR
    definition: "Low-risk deviation; cosmetic, clarity,
      or an affirmative no-issue-found statement."

# Tools
# Each agent's front matter lists which subset of these tools it has access to.

tools:
  read_file:
    description: "Read a file from the project source tree by absolute path."
  list_files:
    description: "List files in a directory of the project source tree."
  list_databases:
    description: "Enumerate all MongoDB databases and their collections."
  query_collection:
    description: "Run an aggregation pipeline or filter
      query against a MongoDB collection."
  sample_documents:
    description: "Fetch N random documents from a MongoDB collection."
  web_search:

finding_required_fields:
  - dimension      # the rubric ID being reported on
  - severity       # one of: CRITICAL | MAJOR | MINOR
  - evidence       # quoted code, field name, or document value supporting the finding
  - recommendation # actionable instruction: what to change and to what
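
Because the schema fixes both the severity labels and the required fields of a finding, any agent's output can be checked against it before it is shared. The following is a minimal sketch under assumptions: the constants are hand-copied from the schema above rather than loaded from the YAML file, and validate_finding and the two sample findings are hypothetical.

```python
# Hand-copied from the shared schema above; a real pipeline would load the YAML file.
SEVERITIES = {"CRITICAL", "MAJOR", "MINOR"}
REQUIRED_FIELDS = ["dimension", "severity", "evidence", "recommendation"]


def validate_finding(finding: dict) -> list[str]:
    """Return a list of problems; an empty list means the finding conforms."""
    problems = [
        f"missing field: {name}" for name in REQUIRED_FIELDS if name not in finding
    ]
    severity = finding.get("severity")
    if severity is not None and severity not in SEVERITIES:
        problems.append(f"unknown severity: {severity}")
    return problems


# Hypothetical findings; the "..." values are placeholders, not real audit output.
good = {
    "dimension": "ENUM_COMPLETENESS",
    "severity": "MAJOR",
    "evidence": "...",
    "recommendation": "...",
}
bad = {"dimension": "ENUM_COMPLETENESS", "severity": "BLOCKER"}

print(validate_finding(good))  # []
print(validate_finding(bad))
```

A finding that invents its own severity label, as the second one does, is rejected before it can make reports from different agents incomparable.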

Then you have the "prose," which uses the above constants to communicate with the LLM. For a specific case, an agent that audits biological soundness (the biologist agent), it becomes:

Markdown: biologist-agent.md
---
version: "1.0"
agent: biologist
schema: "./_schema.yaml"
role: "Senior computational biologist"
scope: "biological correctness of harmonization, controlled vocabularies,
  and layer models"
runtime_vars:
  - project_root

tools:
  - read_file
  - list_files
  - sample_documents
  - report_findings
  - web_search

dimensions:
  COARSE_GRAINING:          { required: true, min_findings: 1 }
  ENUM_COMPLETENESS:        { required: true, min_findings: 1 }
  CROSS_FIELD_CONSISTENCY:  { required: true, min_findings: 1 }

severity: "{severity}"
output_contract: "{output_contract}"

report_schema:
  type: BiologyReport
  required_fields:
    - files_reviewed
    - findings
    - summary
---

You are a senior computational biologist with deep expertise in:
• Receptor pharmacology (IUPHAR/BPS classification standards)
• Protein biochemistry and functional classification
  (UniProt/Swiss-Prot, InterPro, Pfam, Gene Ontology nomenclature)
  ...

## SEVERITY GUIDE

CRITICAL: a downstream user would draw a scientifically wrong conclusion.
          Examples: a species from a different family collapsed under the
          wrong genus; a mechanism classified in the wrong direction.

MAJOR:    an important gap or misleading label that a biologist would
          flag in code review but may not cause an immediate error.
          Examples: a well-known target type unmapped; a receptor
          subclass not distinguished.

In prose, we use natural language with more freedom, but it's grounded in the shared constants and enriched with examples. The key insight here is that the normative part (YAML) pins down the vocabulary and data types, while the prose part (Markdown) provides the semantic reasoning context, exactly the kind of combination LLMs excel at.
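
The schema comment says each agent's front matter lists a subset of the shared tools, so that rule can be checked before an agent ever runs. The following is a minimal sketch under assumptions: the tool names are hand-copied from the schema rather than parsed from YAML, the agent list is a made-up example containing one drifted name, and undeclared_tools is a hypothetical helper.

```python
# Tool names hand-copied from the shared schema; a real pipeline would parse the YAML.
SCHEMA_TOOLS = {
    "read_file",
    "list_files",
    "list_databases",
    "query_collection",
    "sample_documents",
    "web_search",
}

# Hypothetical front matter with a drifted name ("search_web" instead of "web_search").
agent_tools = ["read_file", "list_files", "sample_documents", "search_web"]


def undeclared_tools(agent: list[str], schema: set[str]) -> list[str]:
    """Tools the agent requests that the shared schema never defines."""
    return [name for name in agent if name not in schema]


print(undeclared_tools(agent_tools, SCHEMA_TOOLS))  # ['search_web']
```

Small naming drifts like this are easy to miss by eye. Pinning the vocabulary in one normative file is what makes them detectable.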

Why This Works

One of the main reasons the above framework works well with LLMs is their strong performance on instruction-based tasks. LLMs fine-tuned on a domain with constrained instructions find it much easier to recognize patterns. For example, LLMs fine-tuned on datasets of molecules and their properties have been reported to handle a good deal of chemical reasoning. In many cases, determining how two binding molecules interact is non-trivial and normally requires quantum chemistry computations. LLMs can excel at such seemingly unrelated tasks when they encounter enough examples in a constrained domain: in chemistry, a limited set of atoms and bonds arranged in a sequence, with equally constrained outputs.

By the same token, LLMs excel when you set up a normative and prose language in which both parties, you and the LLM, know what they "refer to" every time a new instruction or prompt is exchanged. The shared schema becomes a mutual contract that reduces ambiguity and hallucination.