This TypeScript Predicate Generator Leaves Zod In The Dust

Hey HackerNoon! Over the Christmas holidays I wrote a new tool for TypeScript developers, Type Predicate Generator, that generates type-safe predicates. They are really useful when you don’t want to trust the any type that JSON.parse() returns.

Here I’m gonna compare this new tool (Generator for short) to the three main kinds of such tools. Yep, my tool is far from unique, but it still has some cool tricks up its sleeve.

How is Generator different?

tl;dr: it’s as type safe as it gets, generates static files with code for maximum compatibility, is fast in many ways because it’s AOT and tiny, and comes with a handy unit test generator to be ~100% sure the produced predicates work as expected.

This story goes into a rather deep comparison of Generator to other runtime type checkers, giving also a broader overview of the related topics. It’s turning into a more analytical than “hey, my X is cooler than Y” article with each iteration.

Preamble

First and foremost, my sincere respect to all the tool makers, and especially to those involved in designing, implementing, testing, and documenting each and every tool I’m comparing Generator to in this post (Zod, Typia, ts-auto-guard, ts-runtime-checks). Keep up the great work, folks!

Comparing Generator to the pure runtime checkers

The most popular library in this category is the famous Zod.

Type checkers in this category use only the code that the JavaScript engine runs within an application. No type information is used inside of this class of libraries. Some of them use eval to turn a schema into JS code at runtime, some like zod use pure function composition.

The main benefits of this approach are ease of use and flexibility of expressible rules (it’s built with Turing-complete JS after all). It is also relatively easier to evolve as it’s just JS/TS without the heavy lifting of interfacing with the TypeScript API.

Now to the bread and butter of what Generator has to offer compared to the pure runtime checkers.

Generator produces code that is over 120 times faster

Check out the benchmark results.

The main reason is that Generator produces specialized code that is easy for all the modern JS engines to optimize. Each type predicate function consists of trivial instructions (like typeof x === "string" and x === "constant") that in most cases don’t even call any other functions and never any external or shared functions. JIT JavaScript engines like it when the value types stay local and each distinct type gets its own small function. This helps JS engines specialize these small functions at runtime. See this amazing article for more details on how V8 deals with polymorphic functions.

Pure runtime checkers like zod instead use function composition. In this case, basic building block functions (like hasProperty(object, propertyName)) get reused with values of different types and thus different hidden classes. This leads to frequent deoptimizations (falling back to the slower non-optimized byte code instead of the faster native code). In the most severe cases, the JIT compiler might oscillate between optimizing and then deoptimizing a code path for one type and then another, spending a lot of cycles in just compiling the code instead of actually running it.
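To make the difference more tangible, here is a minimal hand-written sketch of the two styles. It is not actual Generator or Zod output: the User type and the helpers isString, isNumber and hasPropertyOfType are hypothetical names made up for the illustration, and the property narrowing with in assumes TypeScript 4.9 or newer.

```typescript
type User = { name: string; age: number };

// Style 1: a specialized predicate. Only inline typeof checks,
// no calls to shared helpers, one small function per type.
function isUserSpecialized(value: unknown): value is User {
  if (typeof value !== "object" || value === null) return false;
  if (!("name" in value) || !("age" in value)) return false;
  const name = value.name;
  const age = value.age;
  return typeof name === "string" && typeof age === "number";
}

// Style 2: composition out of generic building blocks (hypothetical helpers).
// The same hasPropertyOfType function sees objects of every shape in the app.
type Check = (value: unknown) => boolean;

const isString: Check = (v) => typeof v === "string";
const isNumber: Check = (v) => typeof v === "number";

function hasPropertyOfType(
  object: unknown,
  propertyName: string,
  check: Check,
): boolean {
  return (
    typeof object === "object" &&
    object !== null &&
    propertyName in object &&
    // The cast is only here to keep the illustration short.
    check((object as Record<string, unknown>)[propertyName])
  );
}

const isUserComposed: Check = (v) =>
  hasPropertyOfType(v, "name", isString) &&
  hasPropertyOfType(v, "age", isNumber);
```

Both functions accept and reject the same values. The difference is only in how much shared, polymorphic code the JS engine has to deal with on the way.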

Generator does not have any runtime dependencies

The code that gets into the bundle is the exact code you see in the git diff output in your project after generating the type predicates. The only other code the predicate uses is the built-ins like Array.isArray() (safely wrapped!). This means that the bundle real estate Generator uses for the predicates stays minimal. It grows linearly with the size of types. The net amount of code used to define a Zod type is comparable to the size of an average predicate after minification.
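As a rough idea of what “safely wrapped” can mean here: the built-in Array.isArray() narrows to any[], which silently lets any leak into the rest of the predicate. A tiny wrapper narrows to unknown[] instead. The names safeIsArray and isStringList below are my own placeholders for this sketch, not necessarily what Generator emits.

```typescript
// Array.isArray() is typed as `arg is any[]`; this wrapper narrows
// to unknown[] so every item still has to be checked explicitly.
function safeIsArray(value: unknown): value is unknown[] {
  return Array.isArray(value);
}

function isStringList(value: unknown): value is string[] {
  if (!safeIsArray(value)) return false;
  for (const item of value) {
    // item is unknown here, so skipping this check would not type check
    // anywhere the item is used as a string.
    if (typeof item !== "string") return false;
  }
  return true;
}
```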

Purely runtime type checkers by design are a runtime dependency. They require the runtime library of matchers to be included in the resulting bundle. Most of the libraries use handy constructors that are flexible enough to allow most of the checks a general API would need. This is great, but also affects the bundle size (60+ KB for zod) because the more generalized matchers cannot be tree-shaken statically (here is an interesting attempt on dealing with the bundle size).

Another emerging issue with complex runtime-only libraries that are used in performance- and availability-critical parts of the application is compatibility with the WinterCG runtimes. Heavy runtime libraries tend to target feature-rich runtimes like Node and might throw at runtime in corner cases like error message generation, which requires more testing. This is not an issue if you’re on a widely supported platform like Node or the latest browsers, but if you’re investing in, say, Wasmer Edge, adding a seemingly trivial dependency might turn into turmoil.

Generator produces TypeScript code that is strictly type-safe

This is one of the key distinctive features of Generator. The code produced by Generator that gets into your app bundle first gets checked by your app’s TypeScript setup to verify it’s safe internally and matches the types being checked. Generator plays well here by producing strictly type-safe code that is going to compile even in the strictest configurations. What is also handy is that when you make a change to the type that has a generated predicate, the tsc reminds you to update the predicate too (by just re-generating it).

The purely runtime checkers cannot directly use the power of TypeScript to verify that the composed function has all the required checks in the right order. Even a proper TypeScript type predicate that does not use at least the satisfies type operator can easily fool itself by just returning true for any input value without making sufficient checks.

function isUser(value: unknown): value is User {
  return true; // TypeScript blindly trusts us here
}

There is a way, for several trivial cases, to trick TS into checking that the function body is a type predicate (by using the function return type covariance property):

const isNumberError: (x: unknown) => x is number = (x) => {
  //  ~~~~~~~~~~~~~ Signature '(x: unknown): true' must be a type predicate.
  return true;
};

const isNumberOK: (x: unknown) => x is number = (x) => {
  return typeof x === "number";
};

Test it for yourself in TS Playground. This is a step in the right direction, but don’t expect TS to soon turn into a theorem-proving engine just to cover the type predicate strictness. It’s way easier to prove that a predicate is valid for a smaller subset of operators. This is sort of what Generator is internally, so maybe one day.

Of course, the production-ready libraries like zod use TypeScript internally to check the library’s code correctness and provide utility functions to infer types from the runtime building blocks. This helps with improving the code reliability by far compared to some purely JavaScript libraries. But even this still does not let the tsc of your project verify the final code correctness on its own. There is always some wrapping/linking/helping code that cannot be verified.

Generator produces code that is readable and modifiable

For the cases when there is a blocking feature that the generated predicate does not support or there is a bug in it that requires urgent fixing, the code produced by Generator is readable and can be immediately edited. As the type predicate code does not change often, there would be no pressure to send patches upstream to the Generator source code (even though highly appreciated!). Such a quick fix would be trivial to review in a tiny PR and remain local to the predicate in question, allowing the team to unblock without any sync dependencies on the Generator’s development process.

In the case of a runtime library, a fix would require a patch to the library itself to be able to mitigate the issue. Such a patch would touch all the code that is checked by the library and require more thorough testing. A short-lived fork might be an option here, but would require finding out how to run the build and publish pipeline of a given tool.

Generator does not require defining types using a custom DSL

Generator takes in any type defined in any part of the application using just native TypeScript. Even a type from a third-party library or a different team’s public type that doesn’t use any runtime type checker. It’s just types, everything is compatible, and composable.

Contrary to this, by design, all of the runtime checkers provide a set of classes or functions that form the final type checker instance that is used to check the values. Most of them require inferring the resulting type from the resulting compound function. This effectively turns the runtime generators into DSL-first libraries (popular in the Ruby world) instead of being truly TypeScript first. This way, the focus shifts to a more schema-centric approach of consuming APIs where TypeScript is more of a powerful secondary tool than a primary target.

Generator produces code that is way faster to cold start

This is by design; the generated code is simply more performant to deal with for JS engines. JS engines have developed lots of tricks to make the initial parsing as fast as possible, especially including lazy parsing. The code that Generator produces is native for such optimizations as it’s no different from any other production code that constitutes the rest of your app. It is also trivial to tree-shake.

The runtime checkers approach by design requires some code to be run at the startup that forces the JS engine to parse, compile, and execute the library code no matter if the predicate is going to be used or not right away. It is possible, though, to wrap the runtime checkers in factory functions to mitigate the issue, but the runtime library code still has to get into the bundle and get parsed at the startup.
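Here is a minimal sketch of the factory-function workaround. The buildUserCheck function is a hypothetical stand-in for an expensive schema construction in a runtime library, and lazy is a helper I made up for this example. Note that it only postpones the construction; the library code itself would still be bundled and parsed.

```typescript
type Check = (value: unknown) => boolean;

// Counts how many times the checker has actually been constructed.
let constructions = 0;

// Hypothetical stand-in for building a checker out of library blocks.
function buildUserCheck(): Check {
  constructions += 1;
  return (value) =>
    typeof value === "object" && value !== null && "id" in value;
}

// Defers the construction until the first call and caches the result.
function lazy(factory: () => Check): Check {
  let cached: Check | undefined;
  return (value) => {
    if (cached === undefined) {
      cached = factory();
    }
    return cached(value);
  };
}

// Nothing is constructed at module load time, only on first use.
const isUserLike = lazy(buildUserCheck);
```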

Note that there are JavaScript runtime environments that rely on heap snapshots (V8 snapshots, WASM-based JS engines) that allow you to make a snapshot of the JS engine memory after the app has been initialized. If this reminds you of a preforking network daemon, then it’s a really close analogy. These environments can help with significantly improving the cold start times for the runtime checkers but still at the expense of more memory usage as the final checker function has to be constructed in the heap even when not used.

Generator does not use `eval()`

Generator compiles the predicate code during the build step and does not require eval() and friends that are not available in high-performance and secure runtimes like Vercel Edge Runtime, AWS CloudFront, Cloudflare Workers and Akamai EdgeWorkers.

The runtime checkers that use eval() and friends at runtime to allow for the JIT optimizations introduce even more steps for the JS engine to make before the actual code of a type predicate can be run. The eval() approach requires the library to build the predicate source code as a string (usually combining it out of smaller strings that then have to be garbage collected), tell the JS engine to parse it, and only then execute it. While flexible, this approach introduces additional startup latency, memory consumption, and overall more demanding requirements to the runtime environment.
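For illustration, this is roughly what the string-building approach looks like, reduced to a toy. The Schema shape and the compileCheck function are hypothetical and do not mirror any particular library. The sketch uses new Function, one of the “friends” of eval(), so it is expected to fail in runtimes that forbid code generation from strings.

```typescript
type Check = (value: unknown) => boolean;

// A toy schema: property name to the expected typeof result.
type Schema = Record<string, "string" | "number">;

function compileCheck(schema: Schema): Check {
  // Step 1: concatenate the predicate source out of small strings.
  const parts: string[] = ['typeof v === "object"', "v !== null"];
  for (const key of Object.keys(schema)) {
    parts.push(
      "typeof v[" + JSON.stringify(key) + "] === " + JSON.stringify(schema[key]),
    );
  }
  const source = "return " + parts.join(" && ") + ";";

  // Step 2: ask the JS engine to parse and compile the string at runtime.
  const compiled = new Function("v", source);

  // Step 3: only now can the check actually run.
  return (value) => compiled(value) === true;
}

const isPoint = compileCheck({ x: "number", y: "number" });
```

All three steps happen in the running application, on every cold start, which is exactly the work Generator moves to the build step.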

I should mention here that using eval() is a potential security issue. The resulting artifact can be considered by security audit tools as potentially vulnerable. Honestly, it should not be a real issue in the case of a type checker library, especially if the used library source code is frequently audited and comes from a trusted source.

Easier debugging and better stack traces

Every library from time to time introduces or reveals a bug. With Generator, the source code that might raise an exception is explicitly bundled with your app and covered with source maps. Using a step debugger on the generated code is the same experience as stepping through your own app code. The stack trace in the error reporting tool is going to be crystal clear too. Such a bug can rarely happen in code like what Generator produces, but it’s still possible, e.g. when a Proxy object throws on property access, a getter returns an unexpected null, or a library misuses patching global prototypes.
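A small sketch of the Proxy case, assuming TypeScript 4.9 or newer for the in narrowing. The isUser predicate is hand-written in the spirit of the generated code (it is not real Generator output), and the hostile object is something I made up to trigger the error.

```typescript
type User = { name: string };

function isUser(value: unknown): value is User {
  if (typeof value !== "object" || value === null) return false;
  if (!("name" in value)) return false;
  // This property read is the line a stack trace would point at.
  return typeof value.name === "string";
}

// A Proxy whose get trap throws on any property access.
const hostile: unknown = new Proxy(
  { name: "Ann" },
  {
    get() {
      throw new Error("property access denied");
    },
  },
);

let caughtMessage = "";
try {
  isUser(hostile);
} catch (error) {
  caughtMessage = error instanceof Error ? error.message : String(error);
}
```

The in check passes because there is no has trap, and the exception is thrown from a plain line of your own application code, so the debugger and the stack trace land right on it.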

A runtime library in most cases would produce a stack trace with minified symbols. This is especially visible if used with the new native support for TypeScript in Node.js, where there are no source maps in use at all.

Native IDE experience

Generator’s code is trivial to navigate to through every IDE’s “goto definition” feature. The hover types are also just the exact types used in your application, no added wrappers or renaming. Plus, the actual code of the type predicate is available a click away in case you’d need to modify it.

In the case of zod, you’d see the inferred types that cannot have other neat features like type-level JSDoc (property-level JSDoc works though) or type-level generics (runtime workaround exists but requires some effort).

Generator produces code that is easy to review

Another benefit of having explicitly generated code version-controlled in your repository is that it’s trivial to audit. All the changes are immediately visible during PR review and to GitHub-based code scanners. This also makes upgrading the Generator package trivial as all the changes the new version introduces into the generated code are immediately visible in the project’s git diff and are covered by the generated unit tests.

For the sake of completeness on the topic of safety and touching a bit the supply chain security, the users of a runtime library are still running a compiled-to-JS source code of the runtime checker that can theoretically be anything as the *.d.ts files don’t provide post-build verification for the *.js code that comes in the library bundle. So, for more safety and security-strict applications, the ability of having the actual code verified by the project’s set of security tools can be important.

Comparing to TypeScript compiler plugins

The most popular and feature-rich tool in this category is Typia.

The type checkers in this category generate the type predicates code during the build step. They hook into the tsc compiler as plugins and provide type-level helper types that get transpiled into the actual JavaScript code during the code generation stage of the tsc compilation pipeline.

The main strengths of the tools in this category are that they are super handy to use (just add a keyword-type like is<MyType> and it magically works) and support virtually any type TypeScript has to offer. This power comes from the fact that a compiler plugin is in the end a Compiler Plugin: it can do anything and everything tsc can.

Generator does not require `tsc` plugins

Generator is a standalone tool that has its own TypeScript version bundled inside that does not interfere with your app’s TypeScript setup. Even if Generator breaks during an upgrade, the predicates code it has generated is already checked into your repository and is not going away. The code is rather static and does not strictly require Generator to even be part of the build pipeline; you can simply copy the code from the Playground.

As TypeScript does not officially support plugins for tsc, the checkers that rely on hooking into tsc require patches to tsc itself. This is a blocker for most teams that want to run a vanilla TypeScript compiler for different reasons. Mostly, it’s reliability as TS is the safety net that proves the code will work in production.

Another issue with having tsc plugins is that major TypeScript updates tend to introduce breaking changes to the internal APIs. In the optimistic case, an update just breaks the plugins and errors out. In a more tricky case, the plugin is going to continue working but might produce invalid code. This, in its strictest form, requires the app developer to wait till all the plugins have upgraded their TypeScript support, upgraded their test suites, and published the updated package. In projects with strict SLAs, this might also require waiting some time till the community adopts the freshly published package.

I have to admit that tsc plugins provide elegant APIs that I personally like the most as a programming language enthusiast. But strictly speaking, TS plugins extend the language as a whole and are not in line with the course that TypeScript and the rest of the community (see reasoning and comments here) have been taking for a long time. A bit sad, but otherwise, we would have gotten another JS fork (like CoffeeScript or Facebook’s Flow) by now.

Generator produces type-safe TypeScript

As mentioned above, Generator produces TypeScript code that is strictly type safe and is checked by your tsc setup according to the safety rules of your project.

The TS plugins instead produce JavaScript code that is not type checked by the TypeScript compiler, meaning that this part of the code that your project runs is still purely JavaScript even though coming from the TypeScript compiler. Of course, the tool creators test the output code really well, but you still have to outsource the type safety to the external tool build pipeline.

Generator code is explicitly readable and modifiable

As mentioned above, Generator produces code that is readable and modifiable.

To check what a tsc plugin-based checker actually produces, you’d need to extract the generated code from the compilation pipeline. That code is purely JavaScript, which would require you to manually check the produced code’s correctness.

The fact that the produced code is in most setups not checked into the repository makes the package upgrades a multi-step process. To make sure that the new version of the plugin-based checker produces comparable code, you’d need to save somewhere the current emitted code, upgrade the package, and run a diff. With Generator, this comes out of the box.

The only downside of Generator producing files is the explicit build step. With a tsc plugin, there is no code to worry about; the plugin keeps the predicates always up to date. Still, not all the build and development tools support hot reloading with full tsc plugin support (example issue that took almost a year to fix).

Generator code shareable across project boundaries

It’s just a tiny TS file that any JS ecosystem tool can consume as is, along with the rest of your application code.

If you needed to share your TypeScript source code that uses tsc plugins with other teams, you’d have to first publish it. This requires compiling the code down to a JavaScript bundle and publishing along with the *.d.ts files. Another way is to make other teams use the same set and versions of tsc plugins, including the checker plugin in their setup too. This requires some coordination and might slow down migrations to the newer TypeScript versions. Another blocker might come from the teams that are not building their project with tsc and instead use the new tools that natively understand TypeScript syntax and do not require a separate build step (Node.js native TS support, esbuild, Cloudflare wrangler).

Generator emitted code is easy to debug

As mentioned above in Easier debugging and better stack traces, Generator emits code that is friendly to debuggers and error reporting tools. The stack traces are native to your application, giving instant feedback on where the error originates from (both the validation errors and potential runtime errors).

In the tsc plugin case during debugging and reading stack traces, the source maps are going to point only to the single symbol in the source TS code (in most cases, the is<MyType> token). In case of an issue or a bug with the source code or the checker itself, it’s required to inspect the raw JavaScript bundle instead.

There is a workaround though. It should be possible to publish both source maps for your app’s bundle and for the dependencies and set the error reporting tool up to look for the source maps there. Quick googling did not show definitive results on how to deal with transitive source maps though.

Generator supports all the JS/TS tools

The predicate code is easy to use with the new Node.js TypeScript strip feature. No need to use ts-node or precompile the code with tsc. The same applies to the various test runners and edge runtime bundlers.

With a tsc plugin, you’re locked within the tsc-centric infrastructure. It’s not something particularly challenging but might require additional setup and degrade performance. For example, Vite transpiles *.ts files 20-30 times faster with the default esbuild compiler than with tsc. The new super fast ts-blank-space package does not even run the TypeScript type analysis that is required for tsc plugins to work and, through this, is also blazing fast.

So, once again, extending the type system that affects resulting syntax effectively turns this category of checkers into a language extension.

Comparing to the other code generators

The most complete (and dear to my heart as I’ve used it in the past) tool in this class is ts-auto-guard.

This class of checkers produces the final predicate code as distinct static files that should be explicitly imported into the application code and built with the rest of the code. Code generators fall into this category of runtime type checkers.

Most of the arguments above in favour of using Generator more or less apply to all of the tools in this category. The main benefits of this class of tools are compatibility and predictability. This very much resonates with the approach to tooling taken in more pragmatic ecosystems like Golang, compiler development and such.

Generator produces type-safe TS

As mentioned above, Generator produces TypeScript code that is strictly type-safe.

Other code generators that also produce TypeScript emit code that is not type-safe. The not-safe emitted code uses unsafe constructions like type casting (with as) or can rely on the unsafe types (mostly any). This effectively turns the produced TypeScript code into loosely typed JavaScript. If your application’s setup uses strict TypeScript linters, the produced code would require adding exceptions and lower the overall type safety score.
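To show why this matters, here is a minimal sketch with two hand-written predicates for the same hypothetical User type. Neither is real output of any tool, and the typo is planted on purpose. The safe variant relies on the in narrowing available since TypeScript 4.9.

```typescript
type User = { name: string };

// Unsafe style: `as any` switches the type checker off,
// so the misspelled property compiles without a single warning.
function isUserUnsafe(value: unknown): value is User {
  const candidate = value as any;
  return candidate != null && typeof candidate.nmae === "string";
}

// Type-safe style: no casts and no `any`, only narrowing.
function isUserSafe(value: unknown): value is User {
  if (typeof value !== "object" || value === null) return false;
  if (!("name" in value)) return false;
  // Writing value.nmae here would be a compile-time error,
  // as the narrowed type only knows about the "name" property.
  return typeof value.name === "string";
}
```

The unsafe predicate quietly rejects every valid user, and only a runtime test can reveal it. In the safe one, tsc itself catches this class of mistakes.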

Generator produces readable code

As mentioned above, Generator produces code that is readable and modifiable. In addition to this, compared to other code emitting tools, Generator helps you with reading, debugging, and potentially manually changing the generated code by linearizing all the checks. It makes small steps that are easy to navigate and adds comprehensive comments (coming soon!). This means that the produced code is not minified but is still organized in a way that lets the minifiers of common bundlers easily turn it into a tiny combined if expression anyway. Generator also tries to use meaningful local variable and temporary type names where possible to improve the reading experience.

The code produced by most of the other tools in this category is pre-optimized and thus not really readable. This gives better control to the tool maker to achieve the best performance as not all the bundlers can infer how to deal with rather complex predicate functions for the more complex types.

Generator always produces correct code

The code Generator emits is strictly checked by the TypeScript compiler and is also built using the TS AST builder API (see the awesome ts-ast-viewer.com to play with the API). This makes it impossible for Generator to produce invalid TypeScript.

During my research, I’ve found that for some nontrivial cases, other checkers that use string concatenation can produce invalid or incomplete code. Luckily, the invalid code is immediately rejected by the TypeScript compiler (this is the superpower of the code-generating tools in general), but the incompleteness is not covered as the produced code is not type-safe. This also applies to checkers that rely on eval() at runtime.

Generator code is fast

Small disclaimer here. Most of the type-to-code generators show significantly (100x) better performance than purely runtime solutions and noticeably better startup times compared to those that use eval. But Generator still has some tricks up its sleeve.

What Generator brings to the table is accessing object properties only once and broadly reusing local variables, which makes the code extremely fast even in non-optimizing JS engines (this traces back to the techniques used to make old JS engines faster). Also, this trick potentially simplifies analysis for the modern JS engines.

Most other checkers form full or partial nested property access expressions that take some time for the JS engine to identify and optimize as it’s just more bytecode to analyze:

function isUserProperties(v) {
  // …
  return typeof v.a.b === "object" && v.a.b !== null;
}

// node --print-bytecode --print-bytecode-filter=isUserProperties bytecode.js
// GetNamedProperty a0, [0], [0]
// Star0
// GetNamedProperty r0, [1], [2]
// TestTypeOf #7
// JumpIfFalse [13] (0x22bd65001e68 @ 24)
// GetNamedProperty a0, [0], [0]
// Star0
// GetNamedProperty r0, [1], [4]
// TestNull
// LogicalNot
// Return

function isUserLocals(v) {
  // …
  const b = v.a.b;
  return typeof b === "object" && b !== null;
}

// node --print-bytecode --print-bytecode-filter=isUserLocals bytecode.js
// GetNamedProperty a0, [0], [0]
// Star1
// GetNamedProperty r1, [1], [2]
// Star0
// TestTypeOf #7
// JumpIfFalse [6] (0x5e08e641f02 @ 18)
// Ldar r0
// TestNull
// LogicalNot
// Return

There is no second GetNamedProperty in the local variable code. It uses one more register, though.

Generator also tests the generated code

As a bonus, during the predicates generation, Generator can also optionally emit a load of unit tests for the generated predicates. You can run these tests as part of your application’s test suite.
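To give a feel for the idea, here is a hand-written sketch of what such tests could boil down to. It is not the format Generator really emits: the AccountStatus type, the sample lists and the runPredicateTests function are all made up for this example, and a real setup would use your test runner instead of a plain function.

```typescript
type AccountStatus = "active" | "disabled";

function isAccountStatus(value: unknown): value is AccountStatus {
  return value === "active" || value === "disabled";
}

// Values the predicate must accept.
const validSamples: unknown[] = ["active", "disabled"];

// Values the predicate must reject, including near misses.
const invalidSamples: unknown[] = [
  null,
  undefined,
  0,
  "",
  "Active",
  ["active"],
  { status: "active" },
];

// Returns a description of every sample the predicate got wrong.
function runPredicateTests(): string[] {
  const failures: string[] = [];
  for (const sample of validSamples) {
    if (!isAccountStatus(sample)) {
      failures.push("expected valid: " + String(JSON.stringify(sample)));
    }
  }
  for (const sample of invalidSamples) {
    if (isAccountStatus(sample)) {
      failures.push("expected invalid: " + String(JSON.stringify(sample)));
    }
  }
  return failures;
}
```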

At the moment, this is a rather unique feature that I hope is going to be picked up by other tool makers. Read more about the approach here.

Is it better than ChatGPT?

Yep, it is (for how long?).

This is becoming more important these days, with AI producing more and more content: with Generator, you can trust the produced code as it’s a trivial tool that is too simple to hallucinate. Generator responds with an error to the types it cannot convert instead of producing incorrect code and convincing you it’s correct.

I tried Copilot and ChatGPT. Both AI tools produced unsafe TS that could not even compile without errors at first, no matter what prompt I used. On one of the runs, ChatGPT simply broke in the middle of the code generation with a set of completely out-of-context symbols.

And, as it becomes more and more visible, Generator is way faster than any AI tool. Ah, I forgot to mention, Generator is also free.

Summary

As you can see, there are several use cases where it’s better to challenge the status quo and go full type safe and static.

Generator is also dead simple inside, as it relies on other tools to do the heavy lifting. To produce the code, it uses the TypeScript API. For minification, it expects esbuild with default settings. For type checking, it makes the produced code strictly typed and uses the satisfies operator, so your own tsc does the job. On top of this, Generator emits a set of unit tests for the generated code. Also, the feature set is minimal.
