
(1) DSLs work great sometimes. See https://www.jooq.org/

(2) Elastic Load Balancer is a control loop that responds to workloads; that kind of thing is a commodity.

(3) Under-provisioning is rampant in most industries; see https://erikbern.com/2018/03/27/waiting-time-load-factor-and... and https://www.amazon.com/Goal-Process-Ongoing-Improvement/dp/0...

(4) Anomaly detection is not inherently a distributed-systems problem like the others, but someone who has been burned by those problems might think they need it. Intellectually it's tough. The first algorithm I saw that felt halfway smart was https://scikit-learn.org/1.5/modules/outlier_detection.html#... which is sometimes a miracle; I had good luck using it on text with the CNN-based embeddings we had in 2018, but none at all with SBERT.
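For flavor, a minimal sketch of that detector in use (LocalOutlierFactor from scikit-learn; the data here is synthetic):

    # LOF flags points sitting in low-density regions; fit_predict
    # returns -1 for outliers, 1 for inliers.
    import numpy as np
    from sklearn.neighbors import LocalOutlierFactor

    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0, 1, (100, 2)),  # one dense cluster
                   [[8.0, 8.0]]])               # one obvious outlier
    labels = LocalOutlierFactor(n_neighbors=20).fit_predict(X)
    print(np.where(labels == -1)[0])            # should include index 100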



I've written two DSLs (one with a team) and I'd consider them both successful. They solved the problem and no one cursed them out. I think the most important factor is they were both small.

They were very similar. I even reused the code. One was for writing rules to validate giant forms; the other was for writing rules to make decisions based on form responses.

OK, just ranting on DSLs a bit more. Good DSLs take someone from can't to can. A DSL that's merely meant to save time is far less likely to be useful, because it very likely won't actually save you time.

In both of my DSLs, the driver was that we needed to get complex domain behavior into the program. So you either need to teach a programmer the domain, partner a programmer with a domain expert, or teach a domain expert how to program.

Putting the power in the hands of the domain expert is attractive when there's a lot of work to be done. It frees up programmers to do other things and tightens the feedback loop. If it's a deep domain, it's not like you want to send your programmer to school to learn how to do this. If it's shallow, you can probably have someone cheaper do it.

A DSL comes with a lot of cognitive overhead. If the other option is learning a full programming language, this becomes more reasonable.

A time saving DSL is where someone already knows how to write code, they just want to write less of it. This is generally not as good because the savings are marginal. Then when some programmer wants to change something, they have to learn/remember this whole DSL instead of more straightforward code.

Actually, this makes a simpler rule of thumb. A DSL for programmers is less likely to be a good idea than a DSL for non-programmers.


DSLs are just great; they have never failed for me. I have built them several times, most recently last year, for programming complex ASICs. I've seen them work like a charm countless times.

I cannot understand why it went badly for him…


I think there are two big reasons DSLs often underdeliver. The first is not understanding the cognitive overhead budget. If it's something where it's being used infrequently or by a lot of new people, that's a lot of overhead to be spent each time. Sometimes people think of writing DSLs for tests because they have to write a bunch, but it's really easy for this to suck. I have a test turn red and now I have to learn a DSL to deal with it? Ew.

The second is fuzzier. It's putting a DSL over something complex and hoping this will fix things. Writing SQL queries for this system takes a bunch of time and is error prone? Just put a DSL over it! Except all those details and errors are probably going to leak right through your DSL.

You want to master the domain before you put a DSL over it.


>> The first is not understanding the cognitive overhead budget. If it's something where it's being used infrequently or by a lot of new people, that's a lot of overhead to be spent each time. Sometimes people think of writing DSLs for tests because they have to write a bunch, but it's really easy for this to suck. I have a test turn red and now I have to learn a DSL to deal with it?

What is the alternative to the DSL with lower cognitive load? I do not follow. Every single DSL I've seen REDUCES the cognitive load, by letting you express the concept in the very language of the problem at hand, which the SME should be more than familiar with.

About the second point: I see many criticisms in this thread aimed at DSLs on top of SQL. Whatever somebody is building on top of SQL and selling as a DSL, it is not one. Period. I cannot think of any sensible way of building a DSL on top of a query language. No wonder people hate the idea. It is a BAD one.


> What is the alternative to the DSL with lower cognitive load?

In the test example, writing it directly in the programming language. This will usually lead to code that is more verbose and repetitive, but understanding the first example will be faster.

I think of cognitive load like a line. X is the number of cases you’re working with, Y is cognitive load [0]. For someone who already knows a programming language, the DSL is going to have a higher Y intercept since you have to learn something new before you understand the first case. Hopefully, it’s a shallower slope so as you deal with more cases the upfront cost gets paid back. If you have lots of people dealing with one case or doing it infrequently enough they have to relearn each time, this payoff never happens.

This model extends past DSLs to all abstractions. It’s why people often end up happier with test code that’s less abstract/DRY. The access pattern supports it.

Looking at it this way also explains why a DSL for a non-programmer is more likely to be useful. The intercept can be lower than an actual programming language, so you’re ahead from the start.

[0] It’s really more of a curve, but the line model works conceptually.
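To make the break-even point of the line model above concrete, a toy calculation (the numbers are invented):

    # Total load = intercept + slope * cases. The DSL pays off once its
    # higher intercept is repaid by its shallower slope.
    def break_even(b_dsl, m_dsl, b_lang, m_lang):
        return (b_dsl - b_lang) / (m_lang - m_dsl)

    print(break_even(b_dsl=10, m_dsl=1, b_lang=2, m_lang=3))  # 4.0 cases

Below four cases, the plain language wins; above it, the DSL does.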


These examples sound like things that aren’t really DSLs… Or in other words it sounds like someone is trying to make something “simpler than it actually is”.

DSLs are supposed to be for making it easier to perform computation in a specific context. Software tests have about as many degrees of freedom as the programming language they are written in, so I’m not sure they are an ideal use case for a DSL— not without a lot of discipline at least.

For a DSL to make sense, IMHO, you need to be able to write down a complete and correct specification for it. I doubt that is even possible in the given examples :shrug:


Are your DSLs used by other people and do they share your opinion? In my experience DSLs are nice to work with for the creator, but it’s much more work in documentation, training, intelligible error handling, and so on, to make a DSL that’s easy for others to learn and use.

I do like DSLs, but the value proposition is often difficult, IMO.


There's a difference between "works once for my very specific problem" and "works most of the time for a wide range of problems".

DSLs, in my experience, usually fail by the latter definition. It's very hard to make a small language that precisely captures its domain of application, produces easy-to-manage programs no matter the size, and is easy to analyze in terms of performance and side effects.


There are hundreds of DSLs for ASIC design but not a single one of them has ever been used for actual tapeout. It's 100% unheard of. Hence, I doubt your DSL saved any time over using an RTL language directly. Sorry for sounding harsh, but if you work in the area you understand my skepticism about ASIC design DSLs.


> I cannot understand why seems it was bad for him…

There are many poorly designed libraries, and DSL design is no easier. While I haven’t personally encountered any, I’m sure there are numerous half-baked DSLs out there.


Here's a few DSL (for programmers) examples that might make sense (the context is embedding DSLs into Python literal strings): https://news.ycombinator.com/item?id=41214365

For example, bash and SQL DSLs may be immediately useful by protecting against shell/SQL injection: subprocess.run(sh"command {arg}") may translate to subprocess.run(["command", os.fspath(arg)])

No shell--no shell injection. The assumption is that it enables sh"a | b > {c}" syntax (otherwise you'd just call subprocess.run directly). Implementing it in pure Python by hand would be more verbose, less readable, and more error-prone.
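To illustrate: the sh prefix is hypothetical, but you can fake the no-shell part with a plain function (this sketch only handles whole-token placeholders, no pipes or redirects):

    import shlex
    import subprocess

    def sh(template, **kwargs):
        # Split the template once with shell-like rules, then substitute
        # each {name} placeholder as a single argv element. No shell is
        # ever invoked, so nothing can be injected.
        argv = []
        for token in shlex.split(template):
            if token.startswith("{") and token.endswith("}"):
                argv.append(str(kwargs[token[1:-1]]))
            else:
                argv.append(token)
        return subprocess.run(argv, check=True)

    # sh("tar czf {dest} {src}", dest="out.tgz", src="data dir/")
    # runs ["tar", "czf", "out.tgz", "data dir/"]; the space in the
    # argument never reaches a shell.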


Yes, there are definitely counterexamples. It’s not black and white at all.


I think the theory of domain-specific languages is very valuable, even if there's rarely a need for a full implementation of one.

As I see it, a DSL is just the end-state of a programmer creating abstractions and reusable components to ultimately solve the real problem. The nouns and verbs granted by a programming interface constrain how one thinks, so a flexible and intuitive vocabulary and grammar can make the "real program" powerful and easy to maintain. Conversely, a rigid and irregular interface makes the "real program" a brittle maintenance nightmare.


I agree. The line between a DSL and regular old programming abstractions is fuzzy. Learning some language design is very helpful because you’ll see every abstraction is a little piece of language.


>>> A DSL for programmers is less likely to be a good idea than a DSL for non-programmers.

Nail on the head time - somewhere else in the thread is jooq which is (yet another) SQL DSL where you end up with from(table).where(student=bob)

This is a perfect example of why the programmer should (just?) learn SQL instead of the DSL - and your comment nails it


(1) jOOQ shines for query generation. For instance, at work we have a search engine that can search maybe 50 fields that are mostly materialized but involve the occasional subquery. Also, you can easily write a template method like

   transitiveQuery(selectClause,relationshipGenerator,where1,...)
and have type inference do the right thing in the compiler and IDE and all of that. You get "hygienic macros": you can write a Java class which is parameterized by one or more related tables and which can be specialized by adding more parameters, subclassing, etc.

(2) Circa 2005 coding PHP I came to the conclusion that the ORM I needed was

   insert(table,{col1: val1, col2:val2, ...})
because writing simple inserts and updates in raw SQL is for the birds; let freshers do it and they will forget to escape something. Such a "framework" can be so simple that you can bend it to the needs of your application. jOOQ gives you something like that, but backed by the Java type system.
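A rough Python sketch of that idea, for the record (parameterized SQL, so values are never escaped by hand; table and column names are assumed to come from code, not user input):

    def insert(conn, table, values):
        # values maps column -> value; the DB-API driver does the escaping.
        cols = ", ".join(values)
        params = ", ".join(["%s"] * len(values))  # %s style, as in psycopg
        sql = f"INSERT INTO {table} ({cols}) VALUES ({params})"
        conn.cursor().execute(sql, list(values.values()))

    # insert(conn, "users", {"name": "bob", "age": 42})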


I do get it (I think!) - but there is a world of difference between “I have 20 years SQL experience and do not want to spend hours maintaining 200 SQL templates, and believe the overhead of this DSL is worth the trade off” vs “use the DSL and you won’t have to teach junior Devs SQL!”

My comment is aimed more at the second part. SQL is tied to the implementation and demands that coders understand it all. A DSL can allow domain experts to express their understanding without having to worry about software trade-offs.

The most successful “DSL” I know of like this is FitNesse tests - just a large number of simple tests where domain experts can, spreadsheet-style, throw in the “gotchas”.

Something like that but more sophisticated is a holy grail - behaviour-driven tests like Cucumber come close, but there is that weird intermediate translation from English phrase to random function - now you have to understand the function to use the phrase, and suddenly you are reading real code to be able to use the fake code, and it never feels clean.

One day I will be clever enough to be able to write a really good test DSL

It's just that whenever I think of “Given user is logged in, visit “textbox” and enter “word”” .. it just looks like a BDD test, not a DSL. Like I said, one day I will be clever enough.


Sure. Note though that there's a long tradition of systems for embedding SQL in conventional programming languages such as

http://infolab.stanford.edu/~junyang/cs145/or-proc.html

which for whatever reason never caught on in the open source world. (I'd blame limitations of current compiler technology and the values of the people who make compilers... if we had composable parsers you could just say "here's a spot for a SQL query in a Java method" in 10 lines of code.) jOOQ approaches that without requiring any change to the compiler. In the past it was awkward to embed SQL in Java because there were no multi-line strings. In Python you could write

    do_query(""" ... a really crazy complicated query with lots of joins
                 and subqueries that is carefully indented to fit in with
                 the rest of the program ... """,
             {"arg1": val1, "arg2": val2})

but in Java, without real map literals, multi-line strings, and such, this was terribly awkward. (If you think List.of(), Map.of() and such are cool: I was writing a computer chess program last month that used List.of(A,B) to create a list used in an inner loop, and it was terrifying how slow it was compared to using an ArrayList.)


I'm not familiar with jOOQ, but I've used Ecto a ton, and the point was never to avoid learning SQL. It's about making queries composable and mapping to domain objects automatically, so that e.g. there aren't dozens of queries to update when you add a field to a domain object.


jOOQ is a disaster and I would not recommend it to anyone.

You write some SQL queries, test them in datagrip or whatnot, then spend the next several hours figuring out how to convert them to the DSL. This problem is compounded when you use "exotic" SQL features like json expressions. Debugging is "print the generated sql, copy it into datagrip/whatnot, tune the query, then figure out how to retrofit that back into the DSL".

It's a huge waste of time.

The primary selling point of jOOQ is "type safe queries". That became irrelevant when IntelliJ started validating SQL in strings in your code against the real data. The workflow of editing SQL and testing it directly against the database is just better.

jOOQ reinforces the OP's point about DSLs.


This is a very specific and popular subset of the DSL point: Let's just invent a language L that is better than horrible standard language X but translates to X. Imagine the vast cubicle farms of X programmers who will throw off their chains and flock to our better language L!

In many scenarios (including JOOQ and all ORMs), X is SQL. I should know, I spent years working on a Java-based ORM. So believe me when I say: ORMs are terrible. To use SQL effectively, you have to understand how databases work at the physical level -- what's a B-tree lookup, what's a scan, how these combine, etc. etc. You can often rely on the optimizer to do a good job, but must also be able to figure out the physical picture when the optimizer (or DBA) got things wrong. You're using an ORM? To lift a phrase from another don't-do-this context: congratulations, you now have two problems. You now have to get the ORM to generate the SQL to do what really needs to be done.

And then there are the generalizations of point made above: There are lots of tools that work with SQL. Lots of programmers who know SQL. Lots of careers that depend on SQL. Nobody gives a shit about your ORM just because it saves you the trouble of the easiest part of the data management problem.


This is an odd take. Your programming language works with objects, the data is in relational tables, you need software to map the relations to objects. Thus the Object Relational Mapper. There's no reason you can't write SQL and let an ORM handle the result set mapping.
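A minimal version of "write the SQL yourself, let the mapper build objects" needs nothing more than a dataclass (sqlite3 here only to keep the sketch self-contained):

    import sqlite3
    from dataclasses import dataclass

    @dataclass
    class User:
        id: int
        name: str

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
    conn.execute("INSERT INTO users VALUES (1, 'bob')")
    # Hand-written SQL in, typed objects out; no query generation anywhere.
    users = [User(*row) for row in conn.execute("SELECT id, name FROM users")]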


If that's all you were doing, then maybe, but it never is. ORMs enable people who have no idea how RDBMSes work to use them, which rarely ends well.

I’m not suggesting that to use RDBMS you should know how to administrate and tune it (though it helps), but knowing their language, and understanding a single data structure (B+ trees) isn’t too much to ask, I think.


> ORMs enable people who have no idea how RDBMS works to use them, which rarely ends well.

In some cases, but the more frequent issue I saw back in the day was the DBA making some really complex schema tuned for what they wanted, then an application trying to use the data in a pretty reasonable OOP manner (1-to-many relationships, etc.), and the DBA being pissed that they were using an ORM instead of their perfect SQL queries and procedures.


> the DBA pissed they were using an ORM instead of their perfect SQL queries and procedures.

Tbh, I don't understand why this is seen as a bad thing. Correction: I know why it is (any changes are obviously going to be dramatically slowed down), but in the long run, I don't understand why people are against it. You wanted something done correctly, so you went to the SME for that specific field, and had them do it for you. Then you decided to throw it away?! Why are you bothering to ask them in the first place?

> 1 to many relationships, etc

I know this was just an example, but 1:M is a perfectly natural part of any RDBMS, and in no way requires an ORM to be done.


> Then you decided to throw it away?! Why are you bothering to ask them in the first place?

Usually this was a mismatch of mgmt or expectations. Hiring old school DBAs and letting them think they "own the data", while plopping them into a huge dev team changing the big SaaS features daily is a recipe for trouble.

I don't fault DBAs per se, though I did work with some who wouldn't look outside their blinders at all.


Of course. And after you understand SQL and databases, ORMs can save you a lot of typing. I've never understood the either/or attitude.


Sometimes it can. Other times, you have the SQL already written in your head, but then you have to figure out how to coerce the ORM to doing what you want.


Even Hibernate has `em.createNativeQuery("type your sql here", SomeResult.class)`. I've never seen an ORM (for an RDBMS) that didn't make it easy to run SQL.


Then what's the point of using Hibernate? Just use the ODBM driver... why are you dragging the gorilla and all of the jungle with you if all you wanted was a banana?


Since we're talking about Hibernate, I assume you mean the JDBC driver? Because the API is tedious and unpleasant.

The mapping of database results to java objects with Hibernate is convenient. The basic "load entity, change a couple fields, let Hibernate persist it" flow is convenient. In a limited set of cases, basic entity graph navigation is convenient.

As I said, if you're working in an object-based language, by definition you need something that maps relations to objects. Hibernate is a competent choice. There are other competent choices, but JDBC is not one of them unless your app is trivial.


Yeah, I confused multiple acronyms here :)

Anyways. Hibernate works on top of JDBC, so if you like its interface, you could make your own, skipping the >99% of Hibernate code that has nothing to do with wrapping the driver.

Or, imagine there was a library Hibernate', that threw away all the ORM stuff, and only offered mapping of SQL results to Java objects and sending queries to the database. Then, why not use Hibernate' instead of Hibernate?

NB. About triviality: from experience, trivial apps tend to work OK with an ORM. Non-trivial ones will usually ditch the ORM because of performance, missing functionality, and the general difficulty of servicing it. So it's the other way around: if you are shooting for the stars, you are probably not going to use Hibernate. Hibernate is one of a variety of tools that helps losers lose less; it's not a tool for winners.


What you've said makes no sense. The "ORM stuff" is what I want, the Object Relational Mapping. Taking relational data and converting it back and forth to objects. And Hibernate is actually pretty good at this.

I think you've built up a strawman in your mind of what you think "ORM" is. Yes, Hibernate is huge and has a lot of features that people shouldn't use. But you can say the same about Microsoft Word; the problem is that everyone uses a different 5% of the huge feature set.

People who work with these technologies on a daily basis don't screw up the core acronyms. I suggest softening your opinion and dropping the platitudes.


It's clear that you want ORM. But you didn't explain why you want it. For all I know, you like to suffer, and that's why you want it, but you've made no compelling argument for people who don't like to suffer to use ORM.

BTW, I'm absolutely on board with you: nobody should use Microsoft Word. There's absolutely no reason to do so. It's a marketing ploy with a lot of grease money paid to the people in charge of procurement in various places. It's absolutely not about 5% of features. It's just the downright worst kind of text editor in popular use today. Ask me how I know this? I worked at a newspaper! Somehow, Microsoft never ventured into this field and didn't sell their garbage there. And nobody uses Microsoft in book publishing or any other sort of publishing. Not for any % of its features. So much so that if you bring a manuscript (as an outside author) to publish a book or an article in a newspaper/magazine and it is in MS Word format, you'll most likely be asked to convert it to another format. And we are talking about people who need a lot of different text-editing features!

And, I really don't care about what you have to suggest. You aren't in a position to make suggestions really ;)


Heh. Nice. I like your zinger.

My go to metaphor has been "XYZ is an angry 800lb gorilla sitting between you and your work."


> you need software to map the relations to objects

If you start with a network data model perspective and build that into your system, then it follows that you'll want a network data model to SQL mapper. That's what ORMs are, and the need for them comes from your approach, not from the tools you use.

There's a different approach - use OOP to build computational abstractions rather than model data. Use it to decompose the solution rather than model the problem. Have objects that talk to the database, exchange sets of facts between it and themselves, and process sets of facts. In the process, you can also start viewing data relationally - as n-ary relations over sets of values - as opposed to binary relationships between tables of records.

Information systems are not domain simulations; simulations compute the future state of the domain, whereas information systems derive facts from known facts at the present time.

For a visual metaphor, car engineers don't use roadmaps as design diagrams and they don't model the problem domain in the systems they build. A car isn't built from streets, turns, road signs, traffic lights, etc. And despite that, cars function perfectly well in the problem domain. A car generally doesn't need to be refactored and reassembled when roads or rules change.


Nah. That's an inconsequential part of the interaction between the database and the application. The reality is that your code has both, the database and the application. And if you want to write good software, you need to know how both work and be an expert at that.

It's infinitely easier and less error-prone to keep the interface between the database and the application to the minimum (just convert the final results of a query to the application objects and embed complete queries in the application code) than to try and create complex query builders behind the scenes of object-to-object interaction.

If you want to make a good product, you may start with ORM, as it may, for a time, delay the need of understanding the relationship between the application and the database, and allow you to experiment faster at the expense of lost performance. Once you know what you need to do, ORM just no longer works: you will have to break it at least in order to deal with performance issues, but often you will also find yourself dealing with the fact that a lot of what you want to express in your queries is either too difficult or even impossible to express in a particular ORM.


I used MySQL before I understood it at the physical level, and now I'm using some other ones that I don't really understand. A MySQL/Postgres noob can get pretty far just knowing to avoid seq scans. It's not ideal, but it'll work. Understanding schema design is more important.

The thing is, ORMs encourage bad schema design and get in the way of the SQL you want. I've seen entire projects ruined this way. I think the only valid reason for an ORM was before RDBMSes had json and similar types. Maybe you had a table with very many columns that you just want to get/set, say a "user profile" table. This also contributed to the NoSQL fad. Nowadays you can throw that into one json column.
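A rough sketch of that pattern (sqlite3 just to keep it self-contained; Postgres would give you a real json/jsonb column with operators):

    import json
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE profiles (user_id INTEGER, data TEXT)")
    # The whole many-column profile becomes one get/set JSON blob.
    conn.execute("INSERT INTO profiles VALUES (?, ?)",
                 (1, json.dumps({"theme": "dark", "lang": "en"})))
    row = conn.execute("SELECT data FROM profiles WHERE user_id = 1").fetchone()
    profile = json.loads(row[0])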


> To use SQL effectively, you have to understand how databases work at the physical level -- what's a B-tree lookup, what's a scan, how these combine, etc.

This is a good reason to use an ORM. But also, as an ORM designer, don't let the ORM be flexible enough to do arbitrary SQL. Only let it do performant data access.


Yep, abstracting away SQL is a common and very costly mistake. The article is about more general system design, otherwise I would have expected to see that in the list.


I've never seen a good DSL beside something like regular expressions, and even there, I hear, a lot of people are upset by the language.

Examples of popular DSLs that I would characterize as bad if not outright failures:

* HCL (the Terraform configuration language). It was obvious from the very beginning that very common problems, like provisioning a variable number of similar appliances, hadn't been addressed in the language. The attempts to add the functionality later were clumsy and didn't solve the problem fully.

* E4X (a JavaScript DSL for working with XML). In simple cases it allowed more concise expression of operations on XML, but it could very quickly become an impenetrable wall of punctuation. It is very similar to Microsoft's LINQ in that it gave the author no indication of how computationally complex the underlying code would be. Eventually, any code using this DSL would be rewritten in a less terse but easier-to-analyze way.

* XUL (Firefox's UI language for extending the browser's chrome). It worked OK if what you wanted to build was Firefox extensions, but Firefox also wanted to sell this as a technology for enterprises to base their in-house applications on Firefox, and it was very lacking in that domain. It required a lot of trickery and roundabout ways of getting simple things done.

* Common Lisp's string formatting language (as well as many others in this domain). Similar to above: works OK for small problems, but doesn't scale. Some formatting problems require some very weird solutions, or don't really have a solution at all (I absolutely hate it when I see code that calls format recursively).

All in all. The most typical problem I see with this approach is that it's temporary and doesn't scale well. I.e. it will very soon run into the problems it doesn't have a good solution for. Large programs in DSL languages are often a nightmare to deal with.


I'm always baffled by hate for DSLs until I realize that what people are criticizing aren't DSLs, but DSLs you have to write from scratch. If you host your DSL on Lisp, then all you have to write is your domain logic, not the base language. Most of the work is already done, and your language is useful from day one. I don't understand why people insist on creating new languages from scratch just to watch them die on the vine, when these langs could have been hosted DSLs on Lisp and actually get used.


Not just Lisp, but any language that has strong support for either literal in-language data expressions like JSON or YAML, or meta-language support like Ruby, Elixir, JSX/TSX (or both!).

Every time you write a React JSX expression, terraform file, config.yaml, etc., you're using a DSL.

I once wrote a JSON DSL in Ruby that I used for a template-based C# code generator. This enabled a .NET reporting web app to create arbitrarily shaped reports from arbitrary RDBMS tables, saving our team thousands of hours. Another team would upload report data to a SQL Server instance, write a JSON file in the DSL, check it against a tiny schema-validator website, submit it, and their reports would soon be live. One of the most productive decisions I ever made.
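As a flavor of what such a spec might look like (the field names here are invented; the real DSL was richer):

    import json

    # A much-reduced report spec: name a table, columns, and an ordering,
    # and a template turns it into SQL.
    spec = json.loads("""
    {"report": "monthly_sales",
     "table": "sales_2024",
     "columns": ["region", "total"],
     "order_by": "total"}
    """)

    sql = "SELECT {} FROM {} ORDER BY {}".format(
        ", ".join(spec["columns"]), spec["table"], spec["order_by"])
    print(sql)  # SELECT region, total FROM sales_2024 ORDER BY total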


Technically yeah, but JSX isn't what people think of when you mention a DSL. I know JS, I know HTML, so I know JSX immediately since it's just templatized HTML inside JS.


This is generally a terrible way to work. Making a bunch of custom syntax even in the same language is just adding more stuff to memorize for no gain.

Even in C, using the "goes to operator" of while(i --> 0), or using special operator overloading like the C++ iostream >> and << operators, is just making people memorize nonsense so someone writing it can be clever.

People don't give presentations with riddles and limericks either. It can be clever as a puzzle but when things need to get done, it is just indulging someone showing off their cleverness at the expense of everyone who has to deal with it.


I think you misunderstood what a DSL is, or at least the point of the OP?

We are advocating exactly to keep the syntax the same as the base language, and add semantic value through the abstractions of the language.


I didn't misunderstand anything.

If you're not changing any syntax and are just using normal function calls, that's an API and that's direct.

If you're not just using normal function calls and are making your own "semantic value through abstractions of the language" you aren't making something that is direct and are creating something that needs to be memorized.

The cleverness and indirection of the new stuff that hides what is really going on is 99% of the time not worth what it gives you, because you have to memorize this new clever thing someone came up with, then you have to learn what it is actually doing underneath that is being hidden.


> If you're not changing any syntax and are just using normal function calls, that's an API and that's direct.

No. Sorry. Wrong. Look at SICP, where they explain the concept of an embedded DSL. Hint: you may be conflating syntax and language.


Everyone understands the concept. Understanding why you shouldn't do it is what takes experience.

If you look at the source code for Doom, it is very straightforward. No fancy stuff, no cleverness, no pageantry of someone else's idea of what "good programming" is, just what needs to happen to make the program.

I'll even give you an example of an exception. Most for loops in C and successors are more complicated than they need to be. Many loops are looping from 0 to a final index and they need a variable to keep track which index they are on. Instead of a verbose for loop, you can make a macro to always loop from 0 and always give you an index variable, so you just give it the length and what symbol to use. Then you have something simplified and that's useful. It's shorter, it's clear, it will save bugs and be easier to read when you need nested loops through arrays with multiple dimensions.

I already gave examples before where clever extra syntax creates an exceptional situation but gains nothing.

The fundamental point here is that these opportunities are rare. Thinking that making up new syntax is a goal of programming is doing a disservice to everyone who has to deal with it in the future.


You are 100% right in all of it, except you are talking about syntax extensions (the for-loop example) and not DSLs. A DSL does not need new syntax; it is a collection of abstractions that allow problems to be expressed in the language of the problem domain. It is not an API, because it is not an interface to functionality. It is not about exposing functionality but about adding semantic value to the upper layers. It may (not necessarily, but may) be formed by a collection of functions, and in that case it is similar to an API in that sense. Sometimes it may indeed include extensions to a language, but then by the standard means of abstraction preferred in that language: classes, templates, functions, structures. The key is to reduce the cognitive load for the end programmers, who may be experts in the problem at hand but not in the underlying host language.

There is also the possibility of embedding in a non-programming language, like XML (e.g. the launch language in ROS), or S-expressions in the Oracle listener config file. You can also go ad hoc, like the .msg files in ROS. But it is always about semantics, not syntax. Syntax is only the medium.


> It is not about exposing functionality but about adding semantic value to the upper layers.

> Sometimes it may indeed include extensions to a language, but then by the standard means of abstraction preferred in that language: classes, templates, functions, structures.

You keep saying that there are no problems and that it isn't like anything mentioned but you don't have any examples.

What is an example of "adding semantic value" that isn't using the languages normal constructs but is still not something someone needs to learn and memorize?


You said a DSL has to have its own syntax, or has to change the language, and that this implies more cognitive load. That is just not the case, as stated in sources like SICP and Wikipedia.

The whole idea of a DSL is exactly to avoid learning something new. Of course there will be some piece of information to be learned, but what are we comparing against? Is there a solution where somebody does not need to learn absolutely anything? Of course not! You have to learn something to be able to use it; the question is how to minimize the cognitive load.

You are right that some examples would help. I have a couple I recently worked on:

1) We had a very complex ASIC with a complicated way of configuring it: there were RF parameters and also a program that runs on the ASIC; say “repeat 20 times {send, receive, analyze, phase-shift}”, though of course the real thing is much more complicated. The ASIC manufacturer provides an API for doing everything, which involves setting registers, flags, internal state machines, etc. We have an expert who knows a lot about RF and the application but is weak in programming. We did it in Lisp, but I will try to explain as if it were C: we made a bunch of functions, many of them very API-like setters and getters. But to program the sequence, we have functions that do flow control. In C it looks a little awkward; in Lisp it is much better. The example above would be: “repeat(20); send(); receive(); analyze(); phase_shift(); iterate();”. The guy who writes that “code” does not care about the base language (he had never heard of Lisp before and could only manage basic Python). But he was already writing those programs in pseudocode for documentation, so the cognitive load for him is minimal. He has to remember to add “();” at the end of each instruction, and that loops are “repeat(n) … iterate”. That's it! That is much less than if he had to learn the whole API of the ASIC; he is not a programmer, he is an RF engineer. You may say it is an API, but look: there was already an API. It makes no sense to do an API over an API. It was all about transforming the language of the API into the language of the problem at hand. The API tries to expose every detail of the hardware, in a language grounded in hardware and C; the domain-specific language tries to hide details or translate things into the language of the problem. So the user of the DSL has to learn less.

2) There was an automated planner with lots of rules. Think about it as “1000 ifs, some nested”; originally, without a DSL, it was all hardcoded in C++. Based on libconfig (think JSON with C syntax), we developed a little language to express the ifs. Note: no new syntax was invented; it is the underlying JSON/libconfig, which is well-known syntax. We only wrote a big “foreach” over all elements in the config file, and each element went through a big “case” that dispatched the substructure to the handling function for each instruction. It took one day to implement. After that, the intelligence lived in separate files, it could be reloaded dynamically, and the people writing the intelligence did not need to be C++ experts.
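The pattern, reduced to a Python sketch (the rule fields are invented, and JSON stands in for libconfig):

    import json

    def handle_if(rule, facts):
        # One handler per instruction type; this one sets a fact when a
        # condition matches.
        if facts.get(rule["field"]) == rule["equals"]:
            facts[rule["set"]] = rule["to"]

    HANDLERS = {"if": handle_if}

    def run_rules(path, facts):
        for rule in json.load(open(path)):     # the big "foreach"
            HANDLERS[rule["op"]](rule, facts)  # the big "case"
        return facts

    # rules.json:
    # [{"op": "if", "field": "speed", "equals": 0,
    #   "set": "state", "to": "parked"}]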


> a DSL has to have its own syntax

If it's the same language it can't be a new language. You didn't link anything with your sources.

> The whole idea of a DSL is exactly to avoid learning something new

But you have to learn the DSL, and you have to throw away all your tools. These are two big problems they introduce, so the problem they solve had better be big, and tools/debugging need to be part of making the DSL. This is why a small DSL is not a good idea.

> We had a very complex ASIC with a complicated way of configuring it: there were RF parameters

This is another side of the story. Passing parameters is data. Inside a program this is a very bad idea, because you can already pass around all the data you want, any way you want, through function calls and memory layouts.

Passing data from one program to another or one computer to another is different, but then that isn't a language, that's a data format like any other file. GCode is a list of 'commands', but fundamentally it is a data format. If you look at the .obj format, it is ASCII and needs to be parsed, but it is not thought of as a language.

> Think about it as “1000 ifs, some nested”; originally, without a DSL, it was all hardcoded in C++. Based on libconfig (think JSON with C syntax), we developed a little language to express the ifs

This sounds like a data format. If something isn't being executed directly, it's data. If it is being executed directly, don't make a new language, because it takes a decade and hundreds of people to get it to work well.


I would really like to have a face-to-face conversation, because I see you have a genuine interest in the discussion; it seems we are talking past each other.

> If it's the same language it can't be a new language. You didn't link anything with your sources.

A language is more than its syntax. For example, Common Lisp, Emacs Lisp, Racket, and Scheme are different languages with the exact same syntax. Java and C have very similar syntax but are two different languages. Source: SICP (https://web.mit.edu/6.001/6.037/sicp.pdf) or the lecture videos on YouTube.

A DSL does not need to have new syntax. Source: the Wikipedia article on domain-specific languages, under "embedded DSL".

If your DSL follows existing syntax, you can use the tools. Note my example with JSON.

>> Passing parameters is data. (…) Passing data from one program to another or one computer to another is different, but then that isn't a language

Well, actually, it is. And data and code cannot be told apart. I can only recommend going through the SICP lectures on YouTube. Your example with GCode is good: code is data, data is code. Also, about the example, consider that it is, as I said, a great simplification; there are lots of details and constraints that I cannot possibly enumerate here. Note, too, that one way of passing data between two computers is RPC, which is a language (procedures and functions are called remotely, executing code on the remote computer, which works with the data); that was actually the case in the example.

> This sounds like a data format. If something isn't being executed directly, it's data. If it is being executed directly, don't make a new language, because it takes a decade and hundreds of people to get it to work well.

A C program is also a data format. Everything is a data format. In the end, inside the compiler or interpreter, the program is an AST, ALWAYS! And an AST is just a data structure!


> Common Lisp, Emacs Lisp, Racket, and Scheme are different languages with the exact same syntax

Far from it. At the s-expression level there are already differences. At the actual language level, Common Lisp for example provides function definitions with named arguments, declarations, documentation strings, etc.

For example the syntax for function parameter definition in CL is:

    lambda-list::= (var* 
                    [&optional {var | (var [init-form [supplied-p-parameter]])}*] 
                    [&rest var] 
                    [&key {var |
                           ({var | (keyword-name var)} [init-form [supplied-p-parameter]])}*
                    [&allow-other-keys]] 
                    [&aux {var | (var [init-form])}*]) 

Above is a syntax definition in an EBNF variant used by Common Lisp to describe the syntax of valid forms in the language. There are different operator types, and built-in special operators and macro operators in particular have a lot of syntax, sometimes complex. See for example the extensive syntax of the LOOP operator in Common Lisp.


Yes, of course I meant the basic S-expression syntax. They are indeed very different languages. The biggest differences, IMHO, are scoping and Lisp-1 vs. Lisp-2, which make for different worlds.


All four now use lexical scope. Scheme also supports dynamic scope.

Lisp-1 vs. Lisp-2 is also a difference, though all of them support lexical closures and function objects.

Racket now has a variant without s-expressions. That's also a huge difference.


You keep saying there is some mythical "DSL" that isn't actually a new language, has no new syntax, works with whatever tools (no word on what language or what tools), isn't an API, and "adds semantic value", but there are no examples after all these comments.

> Well, actually, it is.

This is conflating the term 'language' to mean whatever you want at the moment. There are things that execute and things that don't. These two should be kept as separate as possible, but this is a lesson that people usually need to learn for themselves after being burned many times by complexity that doesn't need to be there.

> And data and code cannot be told apart. I can only recommend going through the SICP lectures on YouTube

> A C program is also a data format

You aren't the first person to be mesmerized by SICP, but when someone becomes invested in the idea that something is a silver bullet, they will tend to seek out information that validates the belief and reject information that doesn't. This pattern is found elsewhere in life too.

For some context: early in the life of Lisp and Scheme there weren't as many scripting languages, and people mostly hadn't had much experience with being able to eval tiny programs inside their programs. These days that might be used to let people write small expressions in a GUI instead of a constant parameter. Many times in programming history, people have seen something new and thought it would solve all their problems.

Java went through the same thing. For a long time people thought deep inheritance hierarchies would save them, until gradually they realized how ridiculous and complicated those made things that could have been simple. Inheritance from a base object let people use general data structures, and garbage collection plus batteries-included seemed great, but programmers conflated everything together and thought this terrible aspect of programming was a step forward.

Lisp was very influential, and people didn't have scripting languages back then, but it isn't a modern way to program.

Data formats are a separate issue, and mixing execution into them is a bad idea too, because the problem they solve is getting data into a program. When you put execution in, you no longer know what you're looking at. Instead of being able to read the data you want directly, now you need to execute something to see what the values actually are. And when you need to execute something, you get all sorts of complexity, including the need to debug and iterate just to see what was once directly visible.


> You keep saying there is some mythical "DSL" that isn't actually a new language, has no new syntax, works with whatever tools (no word on what language or what tools), isn't an API, and "adds semantic value", but there are no examples after all these comments.

I gave you two examples, one in Lisp, one based on JSON. I said no new syntax, but indeed you have to learn something; if it is a DSL, it is a new language, it's in the very name. As long as you make something new, it has to be learned. The point is, if the new thing stays very close to the problem domain, an expert in that domain will learn it faster than anything else. Again, what are the alternatives?

I do think data and code must not be strictly separated. I do not like the OOP hype, for the reasons you mentioned about Java. BUT: the idea of putting data and code together in an object I find good in general.

> You aren't the first person to be mesmerized by SICP, but if someone gets involved in thinking something is a silver bullet, they will tend to try to find information that validates this belief and reject info that doesn't. This pattern is found elsewhere in life too.

I do think SICP is great, and it was a before-and-after moment for me. But I have not found any silver bullet there; quite the opposite. I learned many good ideas, DSLs among them, but I use them only when they make sense.

> Java went through the same thing.

My take on Java (a little off topic): like many other popular languages, it started as a bunch of very good ideas and became a victim of its own popularity. It was overhyped as the solution for everything, got bloated, and many subpar programmers started writing tons of it, until the whole ecosystem was ruined. Something similar happened with BASIC and VB, and it is happening with Python to a certain degree.

> because the problem they solve is getting data into a program. When you put in execution you no longer know what you're looking at. Instead of being able to see or read directly the data you want, now you need to execute something to see what the values actually are. When you need to execute something you have all sorts of complexity including the need to debug and iterate just to see what was once directly visible.

It sounds to me like you got burned by a shitty mixing of code and data that made your life hard.

> This is conflating the term 'language' to mean whatever you want at the moment. There are things that execute and things that don't.

A language does not have to be executable. There are query, configuration, and markup languages. A DSL need not be a new scripting language, or even executable; it can be for configuration. And note that I'm not stretching the definition by any means: TeX and Markdown are languages, as their documentation says throughout. SQL is a language too. Maybe we have different definitions of 'language', and that is where all the confusion comes from? Again, I'm 100% sure that if we met we would be on the same page on 95% of these topics! :)


It seems like most of what you're saying just trades on an overloaded use of the term 'language'.

Yes, people use the term language for different things, it doesn't mean they are the same.

Also what you called a language in your first example everyone else would call an API. What you called a language in your second example is just a config file.

It seems that the reality of what you're saying is that you are using 'lots of little languages' because you are calling lots of things languages that no one else does.


If you wrote some functions, it's not a DSL, it's functions.

If you're calling them in a fancy way with overloads and whatnot, it's not a DSL, it's fancy functions.

DSL is domain specific language. It includes domain specific syntax, domain specific semantics and domain specific libraries.


Absolutely not. It may have specific syntax, but that is not required. Where did you get that definition? In fact the typical example is in Lisps, where you add no syntax at all.

It is not about fancy functions, and not about new syntax. It is about adding semantic value. If somebody adds a collection of functions that allow solutions to a problem to be expressed in the very language of the problem, that is a DSL. If the syntax chosen, for whatever reason (e.g. simplicity), happens to be the same as some underlying language, that takes nothing away from the fact that it is a DSL.

If you look at the examples in SICP, they are “just” fancy functions. But they are DSLs.

An extract from the Wikipedia article:

An embedded domain-specific language (eDSL)[4], also known as an internal domain-specific language, is a DSL that is implemented as a library in a "host" programming language. The embedded domain-specific language leverages the syntax, semantics and runtime environment (sequencing, conditionals, iteration, functions, etc.) and adds domain-specific primitives that allow programmers to use the "host" programming language to create programs that generate code in the "target" programming language.


Exactly. Good DSLs are typically (although not always) embedded in another language. When that is the case, they tend to be a perfect abstraction (if decently implemented).


With static imports in Java I build DSLs that are basically Lisp-style DSLs, like

    var f = f1(f2(a, b, f3(c, quote(f4))));
which have a grammar backed by the full faith and credit of the Java type system. You can code-gen the static imports.


Eh, you can host DSLs in Kotlin and C# these days; you don't even have to sell your engineering team on Lisp. The biggest challenge is explaining how an embedded DSL differs from just being a library (interop from outside the eDSL into the host language is still hard).


> (1) DSLs work great sometimes.

I'll take a stab at fleshing this out: DSLs work great when they have an IDE with autocomplete and a quick (or instant) feedback loop.


I was gonna say something like "DSLs work great when they're small, purposeful and easy to test", I guess yours kind of helps when they're not what I'd suggest :)


This whole thread said exactly what I wanted to write. Feels bad to be so pre-empted.

DSLs that solve a specific problem with a page or two of documentation overhead are great.

Trying to reinvent paradigms, or scope creep, is where the pain comes in. Seems like the post author has been burned by that type of DSL.


> DSLs that solve a specific problem with a page or two of documentation overhead are great.

Do you have any examples? I've heard lots of good things about DSLs, but never had the luck to witness their full glory.

(except for regex, which I love, but it has more than two pages of docs)


I've coded some myself, and have used some... but it depends on where you draw the line.

I'd consider Python's f-string syntax a DSL of sorts (quick example at the end of this comment).

YAML might be considered a simple DSL, if you don't consider it a language/format instead. It's a bit more than 2-3 pages, but it's not hundreds of pages. And a simplified version could be constructed with <10 pages.

Similar to YAML, but for Markdown: I'd call that a DSL too, and it's even simpler than YAML.

Then, further down the spectrum: CSV, JSON, TOML, INI, AsciiDoc.

Once you're in the short form, it's a bit blurry what's a format, what's a DSL, and what is a language.

PS. Sorry for the late answer, I missed the direct question for a bit.
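On the f-string point above: the format spec really is a tiny language of its own.

    # width 12, right-aligned, thousands separators, two decimals
    total = 1234.5
    print(f"{total:>12,.2f}")  # '    1,234.50'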


I would say that is the line between a DSL and just another “L”, a language…


Maybe his problem is with those yucky distributed systems as in Kube, Antilles, etc. I write plain ordinary Java programs that work with a cloud API and compile bash scripts that colonize machines.


I think of Kube as more of an abstraction, whereas Wasp and Dark lang would be DSLs for the same concepts.


Counterexample: regex. In terms of how successful DSLs are, I think something like Perl's regular expressions is at least in the top ten. Most regex users don't care about there being an IDE for it, and I don't think there's a lot of value in regex autocomplete, even if such a thing existed.


Regex is a good counterexample. It's the only useful DSL I can think of. That said, IDE support for regex would be cool, especially considering many languages have special syntax for regex.
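For example, one line of the regex DSL replaces what would otherwise be a hand-rolled scanner:

    import re

    ISO_DATE = re.compile(r"\b(\d{4})-(\d{2})-(\d{2})\b")
    print(ISO_DATE.findall("released 2024-03-27, patched 2024-04-02"))
    # [('2024', '03', '27'), ('2024', '04', '02')]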


> and a quick (or instant) feedback loop.

And yet Terraform/Tofu continues to poison people's brains. It boggles(!) the mind


So: any DSL works best as an embedded language inside another.


Also k8s is living proof that control loops can and do work. That's like the entire point of its existence.


Counterpoint: k8s is a bad orchestration system with bad scaling properties compared to the state machine versions (Borg, Tupperware/Twine, and presumably others). I say this as someone who has both managed k8s at scale and was a core engineer of one of the proprietary schedulers.


It is, and from experience it is also a good example of how control loops are harder than you think. Few people understand it, and it is the source of much underlying trouble.


(3.b) "Being clever rather than over-provisioning" is not generally thought of as a good idea. People would be rather apprehensive if you told them "I'm doing something really clever so we can under- or exactly-provision". I mean, sure, it may indeed work, but that's not the same thing.

(5) Hybrid parallelism - likewise, many people think it's a bad idea because it makes your software system more complex. Again, it may be very useful sometimes, but it's not like many people would go "yes, that's just what I'm missing right now, let's do parallelism with different hardware for different parts of the workflow, with everything running on something slightly different, and it'll all work great like a symphony of different instruments".


I think you’re responding to the tweet from Martin that Steven includes at the top, not to Steven’s list.


The tweet has some claims that don't fly.

You don't get the luxury of offline migrations or single-master writes in the high-volume payments space. You simply don't have the option. The money must continually flow.


That's fine. The points made in the tweet are interesting to discuss too.


I've yet to see a DSL work great. Every single time, I'm asking "why isn't this just Python (or some other lang)?" - especially when it's some jacked-up variant of Python.


Groovy running on jython


Oh wait, regex is a DSL



