nora 2022-06-27 12:44:38 +02:00
parent 72687d4167
commit e4f0dee8a9
6 changed files with 48 additions and 33 deletions

View file

@ -5,7 +5,7 @@
|===
|Quality Goal|Approaches
|Ease of implementation| A simple language with few features. Many features, such as static data, are out of scope.
|Interpreter UX|Keeping track of source locations throughout the compilation/interpretation process, usage of the library `ariadne` (https://crates.io/crates/ariadne) for displaying diagnostics.
|Interpreter UX|Keeping track of source locations throughout the compilation/interpretation process, usage of the crate `ariadne` (https://crates.io/crates/ariadne) for displaying diagnostics.
|Performance|Written in Rust, a natively compiled language. Internally, the AST is compiled into a lower-level IR, where jump labels are resolved to instruction offsets.
|===

View file

@ -1,3 +1,5 @@
:sourcedir: ../../src
[[section-building-block-view]]
@ -5,41 +7,29 @@
[plantuml]
----
() IO -> [Lexer]
[Lexer] -> [Parser]
[Parser] -> [Compiler]
[Compiler] --> [Interpreter]
[Interpreter] -> () IO
() IO --> [Interpreter]
() IO <- [Main] : read input file
[Main] <-> [Parser] : lex and parse code
[Main] <--> [IR] : compile
[Main] <--> [Interpreter] : interpret code
[Interpreter] <--> () IO : stdin and stdout
----
=== Whitebox Overall System
_**<Overview Diagram>**_
Motivation::
_<text explanation>_
The interpreter follows a classic interpreter architecture. First, the source is tokenized by a lexer, implemented using the `logos` library (https://crates.io/crates/logos).
Then, a handwritten recursive descent parser parses the token stream. The abstract syntax tree is then given to a small compiler that compiles it down to a smaller and more limited IR. The compiler also resolves jump labels to offsets.
Contained Building Blocks::
_<Description of contained building block (black boxes)>_
The interpreter then executes this lower level IR.
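
As a rough illustration of the label resolution step described above, the following is a minimal sketch only. The types `AstStmt` and `IrStmt` and the function `resolve_labels` are hypothetical names invented for this example and are not taken from the actual `ir` module. It assumes a two-pass approach: the first pass records the instruction offset of each label, the second emits IR with label names replaced by those offsets.

```rust
use std::collections::HashMap;

// Hypothetical AST-level statements (illustrative, not the project's real types).
#[derive(Debug, Clone, PartialEq)]
enum AstStmt {
    Label(String),
    Jmp(String),
    Nop,
}

// Hypothetical lower-level IR: labels are gone, jumps carry an instruction offset.
#[derive(Debug, Clone, PartialEq)]
enum IrStmt {
    Jmp(usize),
    Nop,
}

fn resolve_labels(ast: &[AstStmt]) -> Result<Vec<IrStmt>, String> {
    // First pass: record the IR offset that each label points at.
    // Labels emit no instruction, so they do not advance the offset.
    let mut offsets: HashMap<&str, usize> = HashMap::new();
    let mut next = 0usize;
    for stmt in ast {
        match stmt {
            AstStmt::Label(name) => {
                if offsets.insert(name.as_str(), next).is_some() {
                    return Err(format!("duplicate label `{name}`"));
                }
            }
            _ => next += 1,
        }
    }

    // Second pass: emit IR, replacing label names with the recorded offsets.
    let mut ir = Vec::with_capacity(next);
    for stmt in ast {
        match stmt {
            AstStmt::Label(_) => {}
            AstStmt::Jmp(name) => {
                let target = offsets
                    .get(name.as_str())
                    .copied()
                    .ok_or_else(|| format!("unknown label `{name}`"))?;
                ir.push(IrStmt::Jmp(target));
            }
            AstStmt::Nop => ir.push(IrStmt::Nop),
        }
    }
    Ok(ir)
}
```

Two passes are used in this sketch so that forward jumps (to a label defined later in the source) resolve just like backward jumps.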
Important Interfaces::
_<Description of important interfaces>_
==== Parser
Lexes the source code, and then turns those tokens into an abstract syntax tree.
==== <Name black box 1>
_<Purpose/Responsibility>_
_<Interface(s)>_
[source,rust]
----
include::{sourcedir}/parser.rs[tag=parse]
----
_<(Optional) Quality/Performance Characteristics>_

View file

@ -7,9 +7,30 @@
|===
|Term |Definition
|<Term-1>
|<definition-1>
|Source code
|A UTF-8 encoded string containing the program.
|<Term-2>
|<definition-2>
|Abstract Syntax Tree (AST)
|A tree-shaped structure that represents the source code of the programming language.
|Token
|A single element of the source code, for example a number, punctuation or an identifier.
|Lexer
|Tokenizes source code into a stream of tokens.
|Parser
|Turns a stream of tokens into an AST.
|Span
|Marks the location in the source code from which the element containing the span originated.
|Intermediate representation (IR)
|An abstract representation of the source code that lives on some level between the source code and target.
|Crate
|The unit of compilation in Rust, often used to refer to a library.
|Compiler Diagnostic
|Compiler error.
|===
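
To make the Span entry in the glossary above more concrete, here is a minimal sketch assuming a span is a half-open byte range into the source string. The `Span` struct shown here, its `text` method, and the `line_col` helper are hypothetical and written for illustration. They are not the project's actual definitions, and they do not use the `ariadne` API.

```rust
// Hypothetical span type: a half-open byte range `start..end` into the source.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Span {
    start: usize,
    end: usize,
}

impl Span {
    // Returns the slice of the source code that this span covers.
    // Panics if the span is out of bounds or not on a char boundary.
    fn text(self, src: &str) -> &str {
        &src[self.start..self.end]
    }
}

// Converts a byte offset into a 1-based (line, column) pair,
// which is the form a diagnostic would typically show to the user.
fn line_col(src: &str, offset: usize) -> (usize, usize) {
    let before = &src[..offset];
    let line = before.matches('\n').count() + 1;
    let line_start = before.rfind('\n').map_or(0, |i| i + 1);
    let col = before[line_start..].chars().count() + 1;
    (line, col)
}
```

Carrying such a span alongside every token and AST node is what allows a later stage to point back at the exact source text when it reports a diagnostic.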

View file

@ -232,7 +232,9 @@ impl InterpretCtx {
}
}
pub fn interpret(stmts: Vec<Stmt>, _spans: Vec<Span>) -> Result<()> {
// tag::interpret[]
pub fn interpret(stmts: Vec<Stmt>) -> Result<()> {
// end::interpret[]
let mut ctx = InterpretCtx {
memory: vec![0; MEMORY_SIZE],
registers: [0; 16],

View file

@ -16,7 +16,7 @@ fn main() -> Result<(), io::Error> {
let stmts = ir::compile(ast.into_iter()).unwrap_or_else(|e| report_and_exit(&file, e));
dbg_pls::color!(&stmts.0);
interpret::interpret(stmts.0, stmts.1).unwrap_or_else(|e| report_and_exit(&file, e));
interpret::interpret(stmts.0).unwrap_or_else(|e| report_and_exit(&file, e));
Ok(())
}

View file

@ -294,7 +294,9 @@ where
}
}
// tag::parse[]
pub fn parse(src: &str) -> Result<Vec<Stmt>> {
// end::parse[]
let lexer = lex(src).spanned();
let mut parser = Parser {
iter: lexer.peekable(),