An examination of multiple solutions.
I am working on creating a safe programming language. One of the major points I need to address while converting source code to assembly is how to handle integer division. With floating point division, dividing by zero is valid. Under IEEE 754 the result depends on the operands, but it is guaranteed to be either NaN, +INF, or -INF. Integer division is another story altogether. On x86 and x64 processors, dividing an integer by zero causes the processor to trap.
Trap: A processor-level exception. Conceptually, traps are similar to software exceptions: when the input cannot be processed, an error is thrown*. However, there is a catch. A trap will probably kill your program, since it cannot be handled directly (unless you’re the OS). …
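One way to avoid the trap is to test the divisor before the division instruction ever runs. The sketch below, in Python for readability, illustrates only that guard. The name `checked_div` and the choice to return `None` are assumptions made for this example, not the approach the language itself settles on.

```python
from typing import Optional


def checked_div(dividend: int, divisor: int) -> Optional[int]:
    """Divide two integers, returning None instead of trapping on zero.

    Hypothetical helper: it mimics the check a compiler could emit
    before a hardware divide instruction.
    """
    if divisor == 0:
        # The hardware divide would trap here, so report the error instead.
        return None
    # Truncate toward zero, as hardware integer division typically does
    # (Python's // operator floors, so the sign is handled separately).
    quotient = abs(dividend) // abs(divisor)
    same_sign = (dividend < 0) == (divisor < 0)
    return quotient if same_sign else -quotient


print(checked_div(7, 2))
print(checked_div(-7, 2))
print(checked_div(1, 0))
```

The caller is forced to deal with the `None` case explicitly, which is the software-level equivalent of catching the problem before the processor does.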
Data structures and methods compilers use to store and track information.
When processing source code, a compiler will produce tremendous amounts of data. Definitions of classes and functions; global variables, local variables, and external variables; the list goes on.
To store this information, compilers rely on a series of tables called symbol tables. Each type of data is stored in its own specialized symbol table. Some are formatted similarly to a SQL database table, others rely on hash tables, and a few are simply lists. …
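To make the hash table variant concrete, here is a minimal sketch in Python, where a `dict` plays the role of the hash table. The `SymbolTable` class, its `define` and `lookup` methods, and the idea of chaining tables through a `parent` scope are assumptions made for illustration; a real compiler may organize its tables quite differently.

```python
class SymbolTable:
    """A hash-table-backed symbol table with an optional enclosing scope."""

    def __init__(self, parent=None):
        self.symbols = {}      # name -> information about the symbol
        self.parent = parent   # enclosing scope, or None for the global scope

    def define(self, name, info):
        """Record a new symbol in this scope; redefinition is an error."""
        if name in self.symbols:
            raise KeyError(f"'{name}' is already defined in this scope")
        self.symbols[name] = info

    def lookup(self, name):
        """Search this scope, then each enclosing scope, for a symbol."""
        scope = self
        while scope is not None:
            if name in scope.symbols:
                return scope.symbols[name]
            scope = scope.parent
        return None


global_scope = SymbolTable()
global_scope.define("counter", {"kind": "global variable", "type": "int"})

function_scope = SymbolTable(parent=global_scope)
function_scope.define("i", {"kind": "local variable", "type": "int"})

print(function_scope.lookup("i"))
print(function_scope.lookup("counter"))
print(function_scope.lookup("missing"))
```

Local names are found in the inner table first, and anything not found there falls through to the enclosing scope.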
Selecting a name is a hard, yet exhilarating process. Brainstorming. Throwing it all out. More brainstorming. Narrowing down the list of possibilities. Getting too attached to one name or feeling like they’re all okay.
Over time, potential names will begin to jump out. As the list narrows, it’s important to make sure the name you select puts you on the path towards success. At this stage, it’s time to do your homework.
This is the first thing you should check. Before becoming attached to a name, it is best to verify the name is available. If the website name you want is already taken you should move on. Could you contact the current owner of the website and try to buy it? Yes. …
Scanners are specialized tools for quickly identifying patterns in text. They are a fundamental component in compilers, JSON parsers, and text processors. The scanner I am presenting is designed to process source code for a compiler, but the code and topics I will cover can be easily adapted for other uses.
Scanners use a series of patterns to produce lexemes. Each lexeme is a contiguous substring of the input text. The lexemes are then selectively paired with a tag corresponding to the pattern. A lexeme-tag pair is called a token. There may or may not be a token for each lexeme depending on what patterns are relevant. …
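The relationship between patterns, lexemes, and tokens can be sketched with Python's `re` module. This is not the scanner presented in the article; the pattern set, the tag names, and the `scan` function are assumptions chosen to keep the example small. Whitespace is matched as a lexeme but produces no token.

```python
import re
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    tag: str      # which pattern matched
    lexeme: str   # the substring of the input that matched


# Hypothetical pattern set: (tag, regular expression).
PATTERNS = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=()]"),
    ("SKIP", r"\s+"),  # whitespace: a lexeme that never becomes a token
]

# Combine the patterns into one expression with a named group per tag.
MASTER = re.compile("|".join(f"(?P<{tag}>{pattern})" for tag, pattern in PATTERNS))


def scan(text: str) -> Iterator[Token]:
    """Yield tokens for each relevant lexeme found in the text."""
    position = 0
    while position < len(text):
        match = MASTER.match(text, position)
        if match is None:
            raise SyntaxError(
                f"unexpected character {text[position]!r} at offset {position}"
            )
        position = match.end()
        if match.lastgroup != "SKIP":
            yield Token(match.lastgroup, match.group())


for token in scan("total = 40 + 2"):
    print(token)
```

Each lexeme is consumed from left to right, and only the lexemes whose pattern is relevant are paired with a tag and emitted.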
An overview of the systems to go from source code to an executable program.
At some point every program we use was compiled by a compiler. From desktop apps to embedded software in a microwave. All programming languages, including assembly, are compiled.
A compiler is a program that translates text or other programs into a new program.
The output of a compiler can vary. Compilers can change the source code language (transpilers), produce bytecode for interpretation, or produce machine code for native execution. At their core, compilers are translators.
The complex systems of a compiler can be grouped into three stages. The first stage is the front end, which is responsible for understanding the source code. It reads, validates, and transforms its input into a common intermediate representation (IR) that is used in the later stages. The second stage is the optimizer. The optimizer uses a series of passes to modify the intermediate representation. With each modification, the final behavior of the program remains the same, but its execution improves. The last stage is the back end. The back end takes the optimized intermediate representation and converts it to the target output. The output can be bytecode for use with an interpreter or native executable code for a specific machine architecture. …
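The three stages can be sketched as a toy pipeline for arithmetic expressions. Everything in this Python example is an assumption made for illustration: the tuple-based IR, the single constant-folding pass, the stack-machine bytecode, and the function names. To stay short, the front end borrows Python's own `ast` module for parsing instead of implementing a scanner and parser.

```python
import ast


def front_end(source):
    """Read and validate the source, producing a tuple-based IR."""
    operators = {ast.Add: "add", ast.Mult: "mul"}

    def lower(node):
        if isinstance(node, ast.Constant) and type(node.value) is int:
            return ("const", node.value)
        if isinstance(node, ast.Name):
            return ("var", node.id)
        if isinstance(node, ast.BinOp) and type(node.op) in operators:
            return (operators[type(node.op)], lower(node.left), lower(node.right))
        raise SyntaxError("unsupported construct in this toy language")

    return lower(ast.parse(source, mode="eval").body)


def optimize(ir):
    """One optimization pass: fold operations whose operands are constants."""
    if ir[0] in ("add", "mul"):
        left, right = optimize(ir[1]), optimize(ir[2])
        if left[0] == "const" and right[0] == "const":
            if ir[0] == "add":
                return ("const", left[1] + right[1])
            return ("const", left[1] * right[1])
        return (ir[0], left, right)
    return ir


def back_end(ir):
    """Convert the IR into bytecode for a hypothetical stack machine."""
    if ir[0] == "const":
        return [("PUSH", ir[1])]
    if ir[0] == "var":
        return [("LOAD", ir[1])]
    return back_end(ir[1]) + back_end(ir[2]) + [(ir[0].upper(),)]


def compile_expression(source):
    """Run the three stages in order."""
    return back_end(optimize(front_end(source)))


print(compile_expression("2 * 3 + x"))
```

The optimizer replaces `2 * 3` with a single constant before the back end runs, so the program computes the same value with fewer instructions.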
The need for a fast, safe, and easily understood language.
From a naive perspective, a computer is, at the hardware level, nothing more than a very fast version of the old pocket calculators. However, computers are actually extremely complex in nature. The x64 architecture, which will be my primary focus initially, contains hundreds of instructions. The documentation from Intel (IA32) is only 5,038 pages long currently.
Now for the real question: Why am I crazy enough to want to make my own programming language?
I believe that a programming language should focus on the following criteria:
· Safety: The language must be inherently safe. People make mistakes. Programmers make mistakes. As long as unsafe code compiles there will always be bugs and security flaws. I recognize that unsafe code is necessary for some tasks, but most code can be safe. In the few cases where code must be unsafe, it can be marked and released from the safety constraints. …