<?xml version="1.0" encoding="UTF-8" ?>
<rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wikidot="http://www.wikidot.com/rss-namespace">

	<channel>
		<title>250bpm-blogs</title>
		<link>http://www.250bpm.com/blog</link>
		<description></description>
				<copyright></copyright>
		<lastBuildDate>Sat, 01 Aug 2015 21:39:35 +0000</lastBuildDate>
		
					<item>
				<guid>http://250bpm.com/blog:56</guid>
				<title>Advanced metaprogramming in C</title>
				<link>http://250bpm.com/blog:56</link>
				<description>

&lt;p&gt;[[div style=&amp;quot;width:50em&amp;quot;]]&lt;/p&gt;
&lt;p&gt;by &lt;span class=&quot;printuser avatarhover&quot;&gt;&lt;a href=&quot;http://www.wikidot.com/user:info/martin-sustrik&quot;  &gt;&lt;img class=&quot;small&quot; src=&quot;http://www.wikidot.com/avatar.php?userid=939&amp;amp;amp;size=small&amp;amp;amp;timestamp=1438465173&quot; alt=&quot;martin_sustrik&quot; style=&quot;background-image:url(http://www.wikidot.com/userkarma.php?u=939)&quot; /&gt;&lt;/a&gt;&lt;a href=&quot;http://www.wikidot.com/user:info/martin-sustrik&quot;  &gt;martin_sustrik&lt;/a&gt;&lt;/span&gt;&lt;/p&gt;
</description>
				<pubDate>Sat, 01 Aug 2015 09:00:28 +0000</pubDate>
												<content:encoded>
					<![CDATA[
						 <div style="width:50em"> <p>Metaprogramming in C is the art of creating new language constructs, typically using macros.</p> <p>Great introduction to C metaprogramming by Simon Tatham can be found here:</p> <p><a href="http://www.chiark.greenend.org.uk/~sgtatham/mp/">Metaprogramming custom control structures in C</a></p> <p>Recently, however, I've faced a problem that could not be solved by the means outlined in the article. I was implementing a control structure with variable number of clauses, similar to C's 'switch' statement. To be more concrete, I was trying to implement an equivalent of golang's 'select' statement in <a href="http://libmill.org">libmill</a> project.</p> <p>Here's how the statement looks like in golang:</p> <div class="code"> <pre> <code>select { case channel1 &lt;- 42: foo() case i := &lt;- channel2: bar(i) }</code> </pre></div> <p>The problem here is that the construct is stateful. It has to collect all the conditions first, then wait till one of them becomes true. At the first sight it looks like the syntax like this cannot be even implemented in C. One's natural instinct is to implement it in idiomatic C way by first creating an array of conditions, then doing the poll and finally executing particular piece of code based on the return value from the poll. Something like this:</p> <div class="code"> <pre> <code>int i; struct select_condition conds[2]; conds[0].channel = channel1; conds[0].operation = SELECT_OUT; conds[0].out_value = 42; conds[1].channel = channel2; conds[1].operation = SELECT_IN; conds[1].in_value = &amp;i; switch(do_select(conds, 2)) { case 0: foo(); case 1: bar(i); }</code> </pre></div> <p>However, such syntax is ugly and verbose. You can tolarate it if you are a C programmer, but if you come from a different background you'll find it simply atrocious. Can we make it better in some way?</p> <p>One obvious thougt would be to wrap the initialisation of the pollset into some handy macros:</p> <div class="code"> <pre> <code>#define in(conds, index, chan, val)\ conds[index].channel = chan;\ conds[index].operation = SELECT_IN;\ conds[index].in_value = &amp;val; #define out(conds, index, chan, val)\ conds[index].channel = chan;\ conds[index].operation = SELECT_OUT;\ conds[index].in_value = val;</code> </pre></div> <p>The user's code would then look a bit simpler:</p> <div class="code"> <pre> <code>int i; struct select_condition conds[2]; out(conds, 0, channel1, 42); in(conds, 1, channel2, i); switch(do_select(conds, 2)) { case 0: foo(); case 1: bar(i); }</code> </pre></div> <p>So far, we've done nothing unexpected. But can we go further? Can we somehow not require the user to specify the size of the pollset in advance? It turns out that yes. It can be done using a technique that I am a proud inventor of: We can build a linked list on the stack!</p> <div class="code"> <pre> <code>#define concat(x,y) x##y #define in(conds, chan, val)\ struct select_condition concat(cond,__COUNTER__); concat(cond,__COUNTER__).channel = chan;\ concat(cond,__COUNTER__).operation = SELECT_IN;\ concat(cond,__COUNTER__).in_value = &amp;val;\ concat(cond,__COUNTER__).next = conds;\ conds = &amp;cond; #define out(conds, chan, val)\ struct select_condition concat(cond,__COUNTER__); concat(cond,__COUNTER__).channel = chan;\ concat(cond,__COUNTER__).operation = SELECT_OUT;\ concat(cond,__COUNTER__).out_value = val;\ concat(cond,__COUNTER__).next = conds;\ conds = &amp;cond;</code> </pre></div> <p>And here's the user's code. Note that do_select() is now getting a linked list as an argument instead of an array. Also note that the linked list doesn't have to be deallocated because it lives on the stack and will be freed automatically at the end of the function:</p> <div class="code"> <pre> <code>int i; struct select_condition *conds = NULL; out(conds, channel1, 42); in(conds, channel2, i); switch(do_select(conds)) { case 0: foo(); case 1: bar(i); }</code> </pre></div> <p>At this point we would like to rearrage the code in such a way the actions &#8212; foo() and bar() &#8212; are collocated with the conditions that trigger them rather then having all the conditions listed first, followed by all the actions.</p> <p>Once again, this looks like it may not be possible. Macros are strictly local. They cannot move pieces of code up or down the source file.</p> <p>But that kind of obstacle cannot stop a dedicated C hacker! If we can't move the code, let's do it in a dynamic fashion. We'll introduce a loop with two iterations. First iteration will build the linked list, second one will execute the action:</p> <div class="code"> <pre> <code>int i; int result = -1; struct select_condition *conds = NULL; while(1) { if(result == - 1) { out(conds, channel1, 42); } if(result == 0) { foo(); break; } if(result == - 1) { in(conds, channel2, i); } if(result == 1) { bar(i); break; } result = do_select(conds); }</code> </pre></div> <p>Nice, except that it doesn't work.</p> <p>The problem is that in() and out() macros declare a local variable (select_condition) which gets out of scope when the surrounding if block finishes. Which means that the value can be overwritten at any time. What we need is to get rid of those if blocks. And, yes, we are going to use gotos to accomplish that. Oh boy, this is going to get ugly! But keep calm and follow me!</p> <div class="code"> <pre> <code>int i; int result = -1; struct select_condition *conds = NULL; while(1) { if(result != -1) goto label0; out(conds, channel1, 42); label0: if(result == 0) { foo(); break; } if(result != -1) goto label1; in(conds, channel2, i); label1: if(result == 1) { bar(i); break; } result = do_select(conds); }</code> </pre></div> <p>Now let's stuff more code into in and out macros to make the user's code simple:</p> <div class="code"> <pre> <code>#define select \ int result = -1;\ struct select_condition *conds = NULL;\ while(1) #define in(chan, val) in_((chan), (val), __COUNTER__) #define in_(chan, val, idx)\ if(result != -1)\ goto concat(label, idx);\ struct select_condition concat(cond, idx) =\ {chan, SELECT_IN, &amp;val, 0, conds};\ conds = &amp;cond;\ concat(label, idx);\ if(result == idx) #define out(chan, val) out_((chan), (val), __COUNTER__) #define out_(chan, val, idx)\ if(result != -1)\ goto concat(label, idx);\ struct select_condition concat(cond, idx) =\ {chan, SELECT_OUT, NULL, val, conds};\ conds = &amp;cond; concat(label, idx);\ if(result == idx) #define end \ result = do_select(conds);</code> </pre></div> <p>Note that we had to split the macros into two so that we can reuse the same value of <span style="white-space: pre-wrap;">__COUNTER__</span> in multiple places.</p> <p>And here is how the user code looks like:</p> <div class="code"> <pre> <code>int i; select { out(channel1, 42) { foo(); break; } in(channel2, i) { bar(i); break; } end }</code> </pre></div> <p>There's some more polishing to do. We can get rid of the break statements by hinding them in the following macro. We can declare 'i' varaible in directly in the out() macro. We should put the whole statement into a code block so that variables like 'conds' don't pollute the sorrounding namespace.</p> <p>These final touches are left as an exercise for the reader. (If out of your wits, you can check real-world implementation of the construct <a href="https://github.com/sustrik/libmill/blob/41504d5b7816e4fd152b11d07309c5e9ccfdfb3d/libmill.h#L175">here</a>.)</p> <p>Have fun and happy Swiss National Day everyone!</p> <p><strong>Martin Sústrik, August 1st, 2015</strong></p> <div style="height:2em"></div> </div> <p>by <span class="printuser avatarhover"><a href="http://www.wikidot.com/user:info/martin-sustrik" ><img class="small" src="http://www.wikidot.com/avatar.php?userid=939&amp;amp;size=small&amp;amp;timestamp=1438465173" alt="martin_sustrik" style="background-image:url(http://www.wikidot.com/userkarma.php?u=939)" /></a><a href="http://www.wikidot.com/user:info/martin-sustrik" >martin_sustrik</a></span></p> 
				 	]]>
				</content:encoded>							</item>
					<item>
				<guid>http://250bpm.com/blog:55</guid>
				<title>Let&#039;s stop kidding ourselves about APIs</title>
				<link>http://250bpm.com/blog:55</link>
				<description>

&lt;p&gt;[[div style=&amp;quot;width:50em&amp;quot;]]&lt;/p&gt;
&lt;p&gt;by &lt;span class=&quot;printuser avatarhover&quot;&gt;&lt;a href=&quot;http://www.wikidot.com/user:info/martin-sustrik&quot;  &gt;&lt;img class=&quot;small&quot; src=&quot;http://www.wikidot.com/avatar.php?userid=939&amp;amp;amp;size=small&amp;amp;amp;timestamp=1438465173&quot; alt=&quot;martin_sustrik&quot; style=&quot;background-image:url(http://www.wikidot.com/userkarma.php?u=939)&quot; /&gt;&lt;/a&gt;&lt;a href=&quot;http://www.wikidot.com/user:info/martin-sustrik&quot;  &gt;martin_sustrik&lt;/a&gt;&lt;/span&gt;&lt;/p&gt;
</description>
				<pubDate>Tue, 28 Jul 2015 05:56:50 +0000</pubDate>
												<content:encoded>
					<![CDATA[
						 <div style="width:50em"> <p>I've browsed resources about API design on the web lately and I was rather surprised that I haven't seen a single mention of the most important thing to know about APIs. And by &quot;most important&quot; I don't mean &quot;covers 20% of the topic&quot; but rather &quot;solve it and the rest of it can be done by a monkey&quot; sort of thing.</p> <p>Here it is: API is a POLITICAL concept, not a technical one.</p> <p>Arguing that API is a technical problem is like arguing that state borders are a geographical phenomenon.</p> <p>Yes, boundaries often coincide with rivers and mountain ridges. But often they do not. And what really matters is that they are established by political processes ranging from peaceful negotiation to outright war.</p> <p>Similarly, API (including network protocols and such) only happens when there are different human groups involved in building software.</p> <p>Think about it: APIs are just functions, n'est-ce pas? So how come that you don't consider that little helper function you've written yesterday to be an API? And why are those functions that you share within your team a bit more API-like, but still not real APIs? And why are the functions used to communicate between departments almost definitely APIs and those exposed to the external users are undisputable and full-blown APIs?</p> <p>Because &quot;API&quot;, let's face it, is just a techie way to say &quot;political boundary&quot;. It's used to separate &quot;I am responsible for this&quot; from &quot;you are responsible for that&quot; and &quot;I will get money/credit for this&quot; and &quot;you will get money/respect for that&quot;.</p> <p>But does it really matter? Would the API design process be different if we acknowledged the fact?</p> <p>Yes, it would.</p> <p>Programmers typically don't like politics and tend to argue using technical language. But that doesn't make politics go away, it just makes it subconscious. In fact, I am tired of people pushing their political agenda under the disguise of technical argument. And the fact that they honestly believe that they are being unbiased and strictly technical doesn't make it better. It makes it worse. It also results in messy APIs with ragged borders and undefined &quot;disputed areas&quot; in places where political consensus wasn't reached.</p> <p>Can we make it better?</p> <p>Yes. But we first have to acknowledge that what's really going on is a dispute about power, money and respect. Only then we can settle that dispute first and move to technical stuff later. Actually, once the political dispute is over, technical design should be fairly trivial.</p> <p>Acknowledging the political nature of APIs also provides insight into the dynamics of the design process. Commercial entities typically want to grab more land. Open source developers working for free want their area to be somehow smaller to avoid too much work. In general, everybody wants to get rid of high cost/low return areas and exercise control over low cost/high return areas. People who are actually working with the technology want the boundaries to be as short and well-defined as possible to minimise friction between the groups. People who do sales or politics for living, on the other hand, want the borders to be long, ambiguously defined and convoluted. They thrive on friction and so they create it.</p> <p>I know that it's hard to get rid of the stereotype of API being a technical matter and to train yourself not to jump to the technical argument before the political issues are solved. However, if you do so, you'll save yourself many a headache and you'll also make the world a better place for the rest of us.</p> <p><strong>Martin Sústrik, July 28th, 2015</strong></p> <div style="height:2em"></div> </div> <p>by <span class="printuser avatarhover"><a href="http://www.wikidot.com/user:info/martin-sustrik" ><img class="small" src="http://www.wikidot.com/avatar.php?userid=939&amp;amp;size=small&amp;amp;timestamp=1438465174" alt="martin_sustrik" style="background-image:url(http://www.wikidot.com/userkarma.php?u=939)" /></a><a href="http://www.wikidot.com/user:info/martin-sustrik" >martin_sustrik</a></span></p> 
				 	]]>
				</content:encoded>							</item>
					<item>
				<guid>http://250bpm.com/blog:54</guid>
				<title>The Second Use Case for Literate Programming</title>
				<link>http://250bpm.com/blog:54</link>
				<description>

&lt;p&gt;[[div style=&amp;quot;width:50em&amp;quot;]]&lt;/p&gt;
&lt;p&gt;by &lt;span class=&quot;printuser avatarhover&quot;&gt;&lt;a href=&quot;http://www.wikidot.com/user:info/martin-sustrik&quot;  &gt;&lt;img class=&quot;small&quot; src=&quot;http://www.wikidot.com/avatar.php?userid=939&amp;amp;amp;size=small&amp;amp;amp;timestamp=1438465174&quot; alt=&quot;martin_sustrik&quot; style=&quot;background-image:url(http://www.wikidot.com/userkarma.php?u=939)&quot; /&gt;&lt;/a&gt;&lt;a href=&quot;http://www.wikidot.com/user:info/martin-sustrik&quot;  &gt;martin_sustrik&lt;/a&gt;&lt;/span&gt;&lt;/p&gt;
</description>
				<pubDate>Mon, 20 Jul 2015 21:08:49 +0000</pubDate>
												<content:encoded>
					<![CDATA[
						 <div style="width:50em"> <p>Everyone have heard that Donald Knuth have invented something called &quot;literate programming&quot;. Everybody thinks that it's something like commenting your code very heavily, but maybe not &#8212; maybe it's different in some way. <a href="https://en.wikipedia.org/wiki/Literate_programming">Wikipedia</a> isn't of much help. Nobody is quite sure.</p> <p>To understand what's going on, you have to remember what Mr. Knuth does for living: He's an academic. He writes papers and textbooks.</p> <p>And how does a paper about computer science look like? It's nicely formated text, with chapters and figures and equations and snippets of code.</p> <p>One writes such papers in TeX.</p> <p>But there's a problem: How would you know that the program you are presenting would actually compile and run? It's a paper after all, you can't just take it and compile it.</p> <p>Enter literate programming.</p> <p>You write the document in a language that makes it easy to extract the snippets of code from the paper and reassemble them to form a full-blown program.</p> <p>That's why there are two seemingly unrelated aspects of literate programming. First, there's a way to distinguish the document (TeX source) and the code (C, for example). Second, there's a simple template language to specify how do the individual snippets of code fit together.</p> <p>Easy. Almost trivial. And yet unexpected and beautiful in its own way.</p> <p>And it's also obvious why the literate programming haven't got much traction: Only minority of programmers works on complex algorithms and publishes papers and textbooks about them. Most programmers write CRUD applications. Or whatever is the latest and shiniest counterpart of CRUD nowadays. When writing CRUD, there's no need for extensive explanation of what the code does. It does CRUD. As simple as that.</p> <p>Actually, there's an influential train of thought among programmers which argues that extensive documentation in the code is necessarily going to bitrot and turn into a maintainability problem. Programmers with such disposition try to keep the comments in code at minimal level and focus on writing self-documenting code (using descriptive variable names and such).</p> <p>In any case, I've recently realised that there's a different niche outside of academia where literate programming would make sense.</p> <p>It's the niche of complex processes which are not very exact or well understood, processes which change often but are executed rarely.</p> <p>Consider, for example, a program that does yearly report for a big IT department.</p> <p>IT changes a lot. The program is run on Jan 1st and after a year, when it is supposed to be run again, it no longer works. All the systems it interfaced with have changed. The API is different and the program doesn't even compile. It references database tables that were either heavily refactored or don't even exist anymore. The tools it used to perform its task no longer work. Licenses may have expired or the vendor have simply stopped supporting them. And so on.</p> <p>In short, it can be said that the program is functional only once a year, on January 1st. The rest of the year it is broken.</p> <p>Or you can put it in a different way: You can say that program execution and debugging blend to such extent that it's hard to tell the difference between the two.</p> <p>How would you go about writing such a program?</p> <p>Writing neat and perfectly polished code doesn't make sense. The program will bitrot almost immediately after it is exectuted anyway. What makes sense though, is documententing the <strong>intent</strong> of the code in a great detail. The programmer tasked with running the program next year will not make much sense of the broken code. References to APIs and tables that don't exist anymore will confuse him more than help him. However, he will greatly benefit from the lengthy description of what the code is <strong>intended</strong> to do.</p> <p>That's why literate programming may help in this case.</p> <p>However, I don't believe that tools Mr. Knuth and his associates have written would be of much help here. The use case is sufficiently different that, although it still can be recognised as literate programming, it needs either extended or completely different tools. I can't imagine engineer documenting the yearly balance program in TeX. Markdown seems to be a much more realistic option. Also, tying the tool to a single programming language probably won't work. Complex processes in most cases need to use a large number of disparate technologies and even pass the execution to the human being at times (e.g. &quot;press the power button&quot;) and we have to account for that.</p> <p>I am still thinking about how to solve this problem. It's extremely interesting because it requires redefining the very interface between the machine and the programmer. Instead of programmer writing the program and machine executing it afterwards we now have them participating on a single continuous process, programmer providing his ability to deal with the unexpected, machine supplying the raw computational power. Such systems are not unheard of (Prolog, anyone?) but they've rarely went beyond being an academic exercise. And maybe it's time to change that.</p> <p><strong>Martin Sústrik, July 20th, 2015</strong></p> <div style="height:2em"></div> </div> <p>by <span class="printuser avatarhover"><a href="http://www.wikidot.com/user:info/martin-sustrik" ><img class="small" src="http://www.wikidot.com/avatar.php?userid=939&amp;amp;size=small&amp;amp;timestamp=1438465174" alt="martin_sustrik" style="background-image:url(http://www.wikidot.com/userkarma.php?u=939)" /></a><a href="http://www.wikidot.com/user:info/martin-sustrik" >martin_sustrik</a></span></p> 
				 	]]>
				</content:encoded>							</item>
					<item>
				<guid>http://250bpm.com/blog:53</guid>
				<title>A case for unstructured programming</title>
				<link>http://250bpm.com/blog:53</link>
				<description>

&lt;p&gt;[[div style=&amp;quot;width:50em&amp;quot;]]&lt;/p&gt;
&lt;p&gt;by &lt;span class=&quot;printuser avatarhover&quot;&gt;&lt;a href=&quot;http://www.wikidot.com/user:info/martin-sustrik&quot;  &gt;&lt;img class=&quot;small&quot; src=&quot;http://www.wikidot.com/avatar.php?userid=939&amp;amp;amp;size=small&amp;amp;amp;timestamp=1438465174&quot; alt=&quot;martin_sustrik&quot; style=&quot;background-image:url(http://www.wikidot.com/userkarma.php?u=939)&quot; /&gt;&lt;/a&gt;&lt;a href=&quot;http://www.wikidot.com/user:info/martin-sustrik&quot;  &gt;martin_sustrik&lt;/a&gt;&lt;/span&gt;&lt;/p&gt;
</description>
				<pubDate>Sat, 18 Jul 2015 06:06:19 +0000</pubDate>
												<content:encoded>
					<![CDATA[
						 <div style="width:50em"> <p>Let me say few words about unstructured programming.</p> <p>First of all, almost nobody can do it any more. You don't believe me? Just try writing you next project in unstructured way, Instead of:</p> <div class="code"> <pre> <code>if(x) { do_stuff1(); do_stuff2(); }</code> </pre></div> <p>write:</p> <div class="code"> <pre> <code>if(!x) goto stuffed; do_stuff1(); do_stuff2(); stuffed:</code> </pre></div> <p>Very soon you'll feel that creeping sense of insecurity: Am I doing it right? If this thing going to be maintainable? And what are the best practices anyway?</p> <p>To be fair, the above doesn't apply to programmers who are fluent in assembly, but everyone else is encouraged to actually do this as an exercise. If nothing else, it will broaden your mind. For extra fun have a look at the <a href="https://en.wikipedia.org/wiki/Duff%27s_device">Duff's device</a> before and after the exercise. What have seemed like a terrible hack before will turn into a reasonable solution, even to a thing of beauty.</p> <p>Anyway, the real point I wanted to make today is that the programming languages called &quot;structured&quot; still have a lot of unstructured traits. And it's not because their authors sucked at programming language design and let the old, discredited constructs leak into modern languages, but because fully structured language is, for all practical purposes, unusable.</p> <p>To be clear, structured language is a language where the control flow follows the syntactic structure of the code, where there are no random cross-jumps from one branch of the parse tree to another.</p> <p>And yes, this time it's not the vilified goto who's the main culprit, but rather the humble and innocent 'return', along with 'break' and 'continue'.</p> <p>Consider this program:</p> <div class="code"> <pre> <code>while(1) { do_stuff1(); if(condition()) break; do_stuff2(); }</code> </pre></div> <p>And here's its fully structured counterpart:</p> <div class="code"> <pre> <code>int done = 0; while(!done) { do_stuff1(); if(condition()) done = 1; if(!done) { do_stuff2(); } }</code> </pre></div> <p>And while the latter program may be only slightly ugly, trying to return from inside three nested loops in a structured way turns into an actual coding horror.</p> <p>You may have already observed that the reason why unstructured traits aren't still showing any signs of demise from the modern programming languages is readability. Dijkstra argued that unstructured programming hurts readability (and thus maintainability) but now it turns out that even the structured way, if applied over-zealously, will do the same thing.</p> <p>This argument looks like a straw man. In the end, nobody sane would write the fully structured program above. So what am I ranting about?</p> <p>Well, it turns out that some programmers don't take full advantage of the unstructured traits in their language, resulting in some pretty ugly code:</p> <div class="code"> <pre> <code>void foo(void) { if(condition1()) { do_stuff1(); } else { if(condition2()) { do_stuff2(); } else { if(condition3()) { do_stuff3(); } else { do_stuff4(); } } } }</code> </pre></div> <p>Why not write it like this:</p> <div class="code"> <pre> <code>void foo(void) { if(condition1()) { do_stuff1(); return; } if(condition2()) { do_stuff2(); return; } if(condition3()) { do_stuff3(); return; } do_stuff4(); }</code> </pre></div> <p>It's good to understand why the latter example is more readable than the former one: It is not only a matter of the fact that large number of nested blocks tend to push the code out of the right side of the screen. More importantly, reading nested blocks means you have to maintain more context in your mind. If there are 5 nested blocks and you are reading the innermost one you are effectively keeping a 5-element parse stack in your mind, for the single purpose of being able to continue reading when you enconter the next 'else' statement.</p> <p>Here's my advice: When creating a new nested block, spend a second considering whether the same thing cannot be achieved by simply by using 'return', 'break' or 'continue'. It may require moving some bits around but the resulting flat and readable code is definitely worth the hassle.</p> <p><strong>Martin Sústrik, July 18th, 2015</strong></p> <div style="height:2em"></div> </div> <p>by <span class="printuser avatarhover"><a href="http://www.wikidot.com/user:info/martin-sustrik" ><img class="small" src="http://www.wikidot.com/avatar.php?userid=939&amp;amp;size=small&amp;amp;timestamp=1438465174" alt="martin_sustrik" style="background-image:url(http://www.wikidot.com/userkarma.php?u=939)" /></a><a href="http://www.wikidot.com/user:info/martin-sustrik" >martin_sustrik</a></span></p> 
				 	]]>
				</content:encoded>							</item>
					<item>
				<guid>http://250bpm.com/blog:52</guid>
				<title>Where are Python macros?</title>
				<link>http://250bpm.com/blog:52</link>
				<description>

&lt;p&gt;[[div style=&amp;quot;width:50em&amp;quot;]]&lt;/p&gt;
&lt;p&gt;by &lt;span class=&quot;printuser avatarhover&quot;&gt;&lt;a href=&quot;http://www.wikidot.com/user:info/martin-sustrik&quot;  &gt;&lt;img class=&quot;small&quot; src=&quot;http://www.wikidot.com/avatar.php?userid=939&amp;amp;amp;size=small&amp;amp;amp;timestamp=1438465174&quot; alt=&quot;martin_sustrik&quot; style=&quot;background-image:url(http://www.wikidot.com/userkarma.php?u=939)&quot; /&gt;&lt;/a&gt;&lt;a href=&quot;http://www.wikidot.com/user:info/martin-sustrik&quot;  &gt;martin_sustrik&lt;/a&gt;&lt;/span&gt;&lt;/p&gt;
</description>
				<pubDate>Sun, 05 Jul 2015 10:24:44 +0000</pubDate>
												<content:encoded>
					<![CDATA[
						 <div style="width:50em"> <p>Everyone believes that macros (as in C preprocessor and assembly macros) are the Bad Thing and no modern language with any self-respect has them.</p> <p>Yes, UNIX philosophy explicitly says &quot;Avoid hand-hacking; write programs to write programs when you can&quot; but without a decent macro language, code generation has to be done by an external tool, which makes it obscure and unreadable for anyone beyond the author of the code. (We've actually seen this. OpenAMQ project was havily code generated which then prooved to be almost an inpenetrable barrier for the contributors.)</p> <p>I think I know what happened: Everyone was scared away be the sheer ugliness and inadequateness of C preprocessor and nobody even tried to do the same thing again and better.</p> <p>Here are some deficiencies of C macros system, just off the top of my head:</p> <ul> <li>Why does it have non-C syntax?</li> <li>Is it even Turing complete?</li> <li>And even if it is, why the hell do I have to define 3 nested macros to do such a basic operation as string concatenation?</li> <li>Why are such elementary primitives as <span style="white-space: pre-wrap;">__COUNTER__</span> not in the standard?</li> </ul> <p>But it's too late to fix that.</p> <p>What about newer languages though?</p> <p>Perl: Nothing. Python: No macros. Ruby: Nada.</p> <p>If you are thinking of implementing a new programming language that will take over the world, give humble old macros a chance. Just take care to implement it in a sane way. After all, macro is just a function that returns a string that then gets inserted into the source code.</p> <p><strong>Martin Sústrik, July 5th, 2015</strong></p> <div style="height:2em"></div> </div> <p>by <span class="printuser avatarhover"><a href="http://www.wikidot.com/user:info/martin-sustrik" ><img class="small" src="http://www.wikidot.com/avatar.php?userid=939&amp;amp;size=small&amp;amp;timestamp=1438465174" alt="martin_sustrik" style="background-image:url(http://www.wikidot.com/userkarma.php?u=939)" /></a><a href="http://www.wikidot.com/user:info/martin-sustrik" >martin_sustrik</a></span></p> 
				 	]]>
				</content:encoded>							</item>
					<item>
				<guid>http://250bpm.com/blog:51</guid>
				<title>Dissecting Software Component&#039;s Reproductive System</title>
				<link>http://250bpm.com/blog:51</link>
				<description>

&lt;p&gt;[[div style=&amp;quot;width:50em&amp;quot;]]&lt;/p&gt;
&lt;p&gt;by &lt;span class=&quot;printuser avatarhover&quot;&gt;&lt;a href=&quot;http://www.wikidot.com/user:info/martin-sustrik&quot;  &gt;&lt;img class=&quot;small&quot; src=&quot;http://www.wikidot.com/avatar.php?userid=939&amp;amp;amp;size=small&amp;amp;amp;timestamp=1438465174&quot; alt=&quot;martin_sustrik&quot; style=&quot;background-image:url(http://www.wikidot.com/userkarma.php?u=939)&quot; /&gt;&lt;/a&gt;&lt;a href=&quot;http://www.wikidot.com/user:info/martin-sustrik&quot;  &gt;martin_sustrik&lt;/a&gt;&lt;/span&gt;&lt;/p&gt;
</description>
				<pubDate>Thu, 25 Jun 2015 07:55:30 +0000</pubDate>
												<content:encoded>
					<![CDATA[
						 <div style="width:50em"> <p>After writing <a href="http://250bpm.com/blog:49">&quot;reusability trap&quot;</a> blog post where I played with evolutionary concepts in the area of sofware engineering I was pointed to the <a href="http://www.laputan.org/selfish/selfish.html">&quot;Selfish Class&quot;</a> essay, which explores similar landscape.</p> <p>The article is fine although a bit naive. It makes it sound like programmers were in control of the evolution of software. Whereas, in fact, they are just a useful source of random mutation in the self-driven evolutionary process. That is, I guess, both bad news (The horror! We are not in charge!) and good news (Hooray! We are not responsible for the mess after all!)</p> <p>In this article I am not going to waste time on the mechanism of software evolution, but rather look at what have evolved and how it helps software components to survive in today's dog-eat-dog software jungle.</p> <p>Many components are quite sophisticated in their survival strategies. They survive and reproduce by being useful to the users. They spread by being sneaky and by tricking users to click on .exe attachments. Some of them thrive in symbiosis with sales people. Some don't care about users at all but are good at being preinstalled on hardware. Free and open software is heavily <a href="https://en.wikipedia.org/wiki/R/K_selection_theory">r-selected</a> and spreads by being laughably cheap to use. Some components keep user's data hostage to avoid being deleted. Some are well-adapted to corporate purchasing process. Other exploit need for interoperability as their primary reproduction mechanism. Yet others rely on standardisation or governmental regulation. And the list goes on and on.</p> <p>Let's now proceed to a concrete case study. It's anonymised and heavily simplified but in its essence based on components observed in the wild.</p> <p>What you see below is a dissected API of the component. It consists of two functionally distinct parts labeled &quot;bait&quot; and &quot;hook&quot;:</p> <div class="image-container aligncenter"><a href="http://250bpm.wdfiles.com/local--files/blog:51/evolution1.jpeg"><img src="http://250bpm.wdfiles.com/local--resized-images/blog:51/evolution1.jpeg/medium.jpg" alt="evolution1.jpeg" class="image" /></a></div> <p>The &quot;bait&quot; is the part that lures the prey. It attracts programmers same way as anglerfish attracts a shrimp. It does so by being useful. C language has no stadard containers so, yeah, it is an attractive proposition. You won't have to implement a queue for the umpteenth time. The implementation is efficient and the API is clean. The &quot;listen&quot; function is a bit weird but we are not going to use it anyway, so who cares?</p> <p>At this point the library have successfully infected a project. Now it faces a different challenge: To survive the long and turbulent life of the project without being removed or replaced by a different library.</p> <p>And, as I've <a href="http://250bpm.com/blog:49">argued</a> in the past, if the component was well-behaved, i.e. if it only had the &quot;bait&quot; organ, its chance of surviving in a long-lived project would be slight indeed.</p> <p>This is where the &quot;hook&quot; becomes useful. The &quot;listen&quot; function allows user to register a callback which will then be used to deliver messages from the queue in asynchronous manner. It sits in the codebase, unused, waiting for an useful idiot to activate it. It may be an intern or a junior developer who stumbles over the function, finds is marginally useful and calls it.</p> <p>At that point the hook penetrates the meat of the project and gets entangled its most critical entrails. Tight coupling is established between the project and the component. The component is now driving project's control flow, becomes a scheduler of sort, and cannot be removed without putting the business logic at risk.</p> <p>Imagine yourself trying to rip it out: Can two callbacks be executed in parallel? Does the business logic have to be thread-safe? Are there multiple listeners on a single queue? How are the messages dispatched among them? Is it a simple round-robin? Randomised? Some other mechanism? Does correctness of the business logic depend on a particular order of execution? And even if the tests pass, can we be sure that the business logic wasn't slightly broken? And so on.</p> <p>However, keep in mind that we've dissected only a part of the component's body. In reality, it is likely to have additional organs like, for example, a &quot;first one is free&quot; marketing strategy where license fees are extracted only after the hook is firmly attached and a circulatory system that brings the money back to sales &amp; marketing which in turn boost its propagation rate.</p> <p><strong>Martin Sústrik, June 25th, 2015</strong></p> <div style="height:2em"></div> </div> <p>by <span class="printuser avatarhover"><a href="http://www.wikidot.com/user:info/martin-sustrik" ><img class="small" src="http://www.wikidot.com/avatar.php?userid=939&amp;amp;size=small&amp;amp;timestamp=1438465174" alt="martin_sustrik" style="background-image:url(http://www.wikidot.com/userkarma.php?u=939)" /></a><a href="http://www.wikidot.com/user:info/martin-sustrik" >martin_sustrik</a></span></p> 
				 	]]>
				</content:encoded>							</item>
					<item>
				<guid>http://250bpm.com/blog:50</guid>
				<title>Finish your stuff</title>
				<link>http://250bpm.com/blog:50</link>
				<description>

&lt;p&gt;[[div style=&amp;quot;width:50em&amp;quot;]]&lt;/p&gt;
&lt;p&gt;by &lt;span class=&quot;printuser avatarhover&quot;&gt;&lt;a href=&quot;http://www.wikidot.com/user:info/martin-sustrik&quot;  &gt;&lt;img class=&quot;small&quot; src=&quot;http://www.wikidot.com/avatar.php?userid=939&amp;amp;amp;size=small&amp;amp;amp;timestamp=1438465174&quot; alt=&quot;martin_sustrik&quot; style=&quot;background-image:url(http://www.wikidot.com/userkarma.php?u=939)&quot; /&gt;&lt;/a&gt;&lt;a href=&quot;http://www.wikidot.com/user:info/martin-sustrik&quot;  &gt;martin_sustrik&lt;/a&gt;&lt;/span&gt;&lt;/p&gt;
</description>
				<pubDate>Tue, 09 Jun 2015 14:58:55 +0000</pubDate>
												<content:encoded>
					<![CDATA[
						 <div style="width:50em"> <p>If there is one principle that should be added to the UNIX philosophy, it is:</p> <p>&quot;Finish your project.&quot;</p> <p>It's the most simple, yet the most disregarded software engineering princinple I can think of.</p> <p>I dare you to list three finished software projects.</p> <p>Having hard time, eh?</p> <p>Except for some basic UNIX tools, like grep or make, it's almost impossible to find anyting that's truly finished, not simply abandoned.</p> <p>Imagine the carpenters were like programmers. You bought a chair. You bought it because you've inspected it and found out that it fulfills all your needs. Then, every other day, the carpenter turns up at your place and makes a modification to the chair. Some of the changes may be useful, some neutral, some are simply annoying and some, like those spikes protruding from the wood, make the chair no longer usable. But irrespective of that: You bought a damned chair and you want it to remain a chair, not to find out that it's some kind of protean piece of furniture that's a chair today and partly a table tomorrow and, who knows, maybe you'll be able to humidify your cigars in it next week. So you refuse to let the carpenter in just to find out next morning that the chair have sneaked out through the back door during the night and had nice shiny dental drill added. At this point there's no recourse but to burn the chair down, put its remanants into the cellar and pour some quality concrete over it to make sure it stays dead forever. But just before the movie ends, there's a shot showing a crack in the concrete making it clear that new, monstrous &quot;Chair, version 2&quot; sequel is coming. Soon.</p> <p>In short, do finish your stuff.</p> <p>Of course, it will need maintenance later on, bug fixes and security patches, but make it functionally complete in the first place.</p> <p>Yes, I hear you claiming that your project is special and cannot be made functionally complete because, you know, circumstances X, Y and Z.</p> <p>And here's the trick: If it can't be made functionally complete it's too big. You've tried to bite off a piece that's bigger than you can swallow. Just get back to the drawing board and split off a piece that can be made functionally complete. And that's the component you want to implement.</p> <p>Here's my own experience:</p> <p>First open source project I've worked on was <a href="https://www.amqp.org/">AMQP</a> messaging protocol and its reference implementation. It was impossible to make it functionally complete because it was designed by committee and it kept growing all the time. If it still lives its long and tortured life it's probably able to sing and dance and provide light entertainment for the entire family by now.</p> <p>Out of the frustration with AMQP I've started my own <a href="http://zero.mq">ZeroMQ</a> project. No committee this time! Surely, I was able to make it functionally complete? Well, no. I've tried to make it do too much. It is a compatibility library, a async I/O framework, a message delimitation protocol and a library of messaging patterns. Today, almost eight years after its inception, the project is still under active development and heading towards the point where it will be able to <a href="http://www.catb.org/jargon/html/Z/Zawinskis-Law.html">send email</a>.</p> <p>Next one: <a href="http://nanomsg.org">nanomsg</a>. An alternative to ZeroMQ. I've tried to limit the scope of project by splitting the transport functionality as well as messaging patterns into separate plug-ins. I was partially successful, but it is still too big a chunk to swallow in a single piece.</p> <p>Lately, after 30 years in the programming business, I've finally managed to cut down my projects to a reasonable size.</p> <p><a href="http://ribosome.ch">Ribosome</a> is a code generator. It generates code and that's it. It's functionally complete. Since I wrote it I've fixed some bugs and it was translated to a couple of alternative programming languages (thanks to Apostolis Xekoukoulotakis and Ali Zaidi!). But that's it. No functionality was added. And, I assure you, there's no plan to do so in the future.</p> <p><a href="http://libmill.org">Libmill</a> is a library that introduces Go-style concurrency to C. After couple of months of development it seems to functionally complete now. In the future it may be ported to different CPU architectures and opertating systems but there will be no functionality change. There will be bugfixes but no feature creep.</p> <p>So, it seems, I am finally there.</p> <p>Please join me in my effort and do finish your projects. Your users will love you for that.</p> <p><strong>Martin Sústrik, June 9th, 2015</strong></p> <div style="height:2em"></div> </div> <p>by <span class="printuser avatarhover"><a href="http://www.wikidot.com/user:info/martin-sustrik" ><img class="small" src="http://www.wikidot.com/avatar.php?userid=939&amp;amp;size=small&amp;amp;timestamp=1438465174" alt="martin_sustrik" style="background-image:url(http://www.wikidot.com/userkarma.php?u=939)" /></a><a href="http://www.wikidot.com/user:info/martin-sustrik" >martin_sustrik</a></span></p> 
				 	]]>
				</content:encoded>							</item>
					<item>
				<guid>http://250bpm.com/blog:49</guid>
				<title>Reusability Trap</title>
				<link>http://250bpm.com/blog:49</link>
				<description>

&lt;p&gt;[[div style=&amp;quot;width:50em&amp;quot;]]&lt;/p&gt;
&lt;p&gt;by &lt;span class=&quot;printuser avatarhover&quot;&gt;&lt;a href=&quot;http://www.wikidot.com/user:info/martin-sustrik&quot;  &gt;&lt;img class=&quot;small&quot; src=&quot;http://www.wikidot.com/avatar.php?userid=939&amp;amp;amp;size=small&amp;amp;amp;timestamp=1438465174&quot; alt=&quot;martin_sustrik&quot; style=&quot;background-image:url(http://www.wikidot.com/userkarma.php?u=939)&quot; /&gt;&lt;/a&gt;&lt;a href=&quot;http://www.wikidot.com/user:info/martin-sustrik&quot;  &gt;martin_sustrik&lt;/a&gt;&lt;/span&gt;&lt;/p&gt;
</description>
				<pubDate>Wed, 03 Jun 2015 23:34:13 +0000</pubDate>
												<content:encoded>
					<![CDATA[
						 <div style="width:50em"> <p>Truly reusable components are clearly separated from the code that uses them. That makes it easy to use the component in a different project. It also makes it easy to rip the component out and replace it by a different one.</p> <p>Non-reusable components cannot be used in different projects. At the same time, they tend to be hairy and almost impossible to rip out of their native project. They are interconnected with the rest of the codebase in subtle ways. Replacing them can break the functionality in unexpected manners and even if it does not you can never be sure that there's no hidden problem waiting for the most inconvenient time to strike you.</p> <p>In a long-lived project, components are being replaced. Nice reusable components are easy to replace and so they are. Ugly non-reusable components are pain to replace and each replacement means both a considerable risk and considerable cost. Thus, more often then not, they are not replaced. As the years go by, reusable components pass away and only the hairy ones remain. In the end the project turns into a monolithic cluster of ugly components melted one into another.</p> <p>This mechanism seems to account for most of the complexity in legacy codebases.</p> <p>Let's call it Sustrik's law:</p> <p><strong>&quot;Well-designed components are easy to replace. Eventually, they will be replaced by ones that are not so easy to replace.&quot;</strong></p> <p>The law seems almost impossible to beat. Are there any policies users can adopt to avoid the trap? Is there anything developers of nice components can do to avoid being replaced? I would love to hear any suggestions.</p> <p><strong>Martin Sústrik, June 2nd, 2015</strong></p> <div style="height:2em"></div> </div> <p>by <span class="printuser avatarhover"><a href="http://www.wikidot.com/user:info/martin-sustrik" ><img class="small" src="http://www.wikidot.com/avatar.php?userid=939&amp;amp;size=small&amp;amp;timestamp=1438465174" alt="martin_sustrik" style="background-image:url(http://www.wikidot.com/userkarma.php?u=939)" /></a><a href="http://www.wikidot.com/user:info/martin-sustrik" >martin_sustrik</a></span></p> 
				 	]]>
				</content:encoded>							</item>
					<item>
				<guid>http://250bpm.com/blog:48</guid>
				<title>Coroutines in C with Arbitrary Arguments</title>
				<link>http://250bpm.com/blog:48</link>
				<description>

&lt;p&gt;[[div style=&amp;quot;width:50em&amp;quot;]]&lt;/p&gt;
&lt;p&gt;by &lt;span class=&quot;printuser avatarhover&quot;&gt;&lt;a href=&quot;http://www.wikidot.com/user:info/martin-sustrik&quot;  &gt;&lt;img class=&quot;small&quot; src=&quot;http://www.wikidot.com/avatar.php?userid=939&amp;amp;amp;size=small&amp;amp;amp;timestamp=1438465174&quot; alt=&quot;martin_sustrik&quot; style=&quot;background-image:url(http://www.wikidot.com/userkarma.php?u=939)&quot; /&gt;&lt;/a&gt;&lt;a href=&quot;http://www.wikidot.com/user:info/martin-sustrik&quot;  &gt;martin_sustrik&lt;/a&gt;&lt;/span&gt;&lt;/p&gt;
</description>
				<pubDate>Sun, 19 Apr 2015 02:55:14 +0000</pubDate>
												<content:encoded>
					<![CDATA[
						 <div style="width:50em"> <p>There are many C coroutine implementations out there.</p> <p>However, they usually allow only a single predefined type of function to be exected as a coroutine. For example, both <em>libtask</em> and <em>libcoro</em> require that the coroutine has the following prototype:</p> <div class="code"> <pre> <code>void foo(void *arg);</code> </pre></div> <p>The following toy implementation of coroutines shows how to execute an arbitrary function as a coroutine. The idea is that if the coroutine invocation performs the context switch to the new coroutine straight away, it can use C compiler to put the arguments on the stack and the library doesn't have to handle it itself. Of course, given that coroutines are executed concurrently, any non-void return value is lost.</p> <div class="code"> <pre> <code>#define STACK_SIZE 16384 volatile int unoptimisable_ = 1; struct cr_ { struct cr_ *next; jmp_buf ctx; }; struct cr_ main_cr_ = {NULL}; struct cr_ *first_cr_ = &amp;main_cr_; struct cr_ *last_cr_ = &amp;main_cr_; #define go(fn) \ do {\ if(!setjmp(first_cr_-&gt;ctx)) {\ char *stack = malloc(STACK_SIZE);\ int anchor_[unoptimisable_];\ char filler_[(char*)&amp;anchor_ - (char*)(stack + STACK_SIZE)];\ struct cr_ cr[unoptimisable_];\ cr-&gt;next = first_cr_;\ first_cr_ = cr;\ char *stack_[unoptimisable_];\ stack_[0] = stack;\ fn;\ free(stack_[0]);\ first_cr_ = first_cr_-&gt;next;\ longjmp(first_cr_-&gt;ctx, 1);\ }\ } while(0) void yield(void) { if(first_cr_ == last_cr_) return; if(setjmp(first_cr_-&gt;ctx)) return; struct cr_ *cr = first_cr_; first_cr_ = cr-&gt;next; cr-&gt;next = NULL; last_cr_-&gt;next = cr; last_cr_ = cr; longjmp(first_cr_-&gt;ctx, 1); }</code> </pre></div> <p>And here's a piece of code that uses it. Note how printf &#8212; a system function with variable argument list &#8212; is invoked like a coroutine:</p> <div class="code"> <pre> <code>void foo(int count, const char *text) { int i; for(i = 0; i != count; ++i) { printf(&quot;%s\n&quot;, text); yield(); } } int main() { go(foo(3, &quot;a&quot;)); go(printf(&quot;Hello, %s!\n&quot;, &quot;world&quot;)); go(foo(2, &quot;b&quot;)); foo(5, &quot;c&quot;); return 0; }</code> </pre></div> <p><strong>Martin Sústrik, April 19th, 2015</strong></p> <div style="height:2em"></div> </div> <p>by <span class="printuser avatarhover"><a href="http://www.wikidot.com/user:info/martin-sustrik" ><img class="small" src="http://www.wikidot.com/avatar.php?userid=939&amp;amp;size=small&amp;amp;timestamp=1438465174" alt="martin_sustrik" style="background-image:url(http://www.wikidot.com/userkarma.php?u=939)" /></a><a href="http://www.wikidot.com/user:info/martin-sustrik" >martin_sustrik</a></span></p> 
				 	]]>
				</content:encoded>							</item>
					<item>
				<guid>http://250bpm.com/blog:47</guid>
				<title>A cryptopuzzle. Ready, steady, go!</title>
				<link>http://250bpm.com/blog:47</link>
				<description>

&lt;p&gt;[[div style=&amp;quot;width:50em&amp;quot;]]&lt;/p&gt;
&lt;p&gt;by &lt;span class=&quot;printuser avatarhover&quot;&gt;&lt;a href=&quot;http://www.wikidot.com/user:info/martin-sustrik&quot;  &gt;&lt;img class=&quot;small&quot; src=&quot;http://www.wikidot.com/avatar.php?userid=939&amp;amp;amp;size=small&amp;amp;amp;timestamp=1438465174&quot; alt=&quot;martin_sustrik&quot; style=&quot;background-image:url(http://www.wikidot.com/userkarma.php?u=939)&quot; /&gt;&lt;/a&gt;&lt;a href=&quot;http://www.wikidot.com/user:info/martin-sustrik&quot;  &gt;martin_sustrik&lt;/a&gt;&lt;/span&gt;&lt;/p&gt;
</description>
				<pubDate>Fri, 16 Jan 2015 08:15:28 +0000</pubDate>
												<content:encoded>
					<![CDATA[
						 <div style="width:50em"> <p>I've solved a nice cryptopuzzle yesterday and solving it was so much fun that it would be a shame not to share it.</p> <p>The background is simple: Elections.</p> <p>There's a lot of clever crypto being devised to help with elections (see e.g. <a href="http://en.wikipedia.org/wiki/End-to-end_auditable_voting_system">E2E audiatable voting systems</a> , also nice talk <a href="https://www.youtube.com/watch?v=ZDnShu5V99s">here</a>), but, as you might notice, there's always an assumption that the voter will come to the polling place to cast the ballot.</p> <p>Half a minute or so of the enforced privacy in the booth is necessary for the voter to do their decision without being coerced. The fact that there's no way, after the fact, for the voter to prove how they have voted makes coercion impossible. And for exactly the same reason you can't sell your vote: You can't prove how you have actually voted and thus there's nothing to sell.</p> <p>What about the online elections though?</p> <p>There are no voting booths, no enforced privacy. Therefore, the voters can be coerced to vote for a specific candidate. They may sell their votes. And that's bad news for democracy, of course.</p> <p>So, let's have a look at Estonian e-voting system (see <a href="http://en.wikipedia.org/wiki/Electronic_voting_in_Estonia">here</a>, and <a href="https://www.youtube.com/watch?v=JY_pHvhE4os">here</a>) to see how do they deal with the problem. It turns out that they allow to cast multiple e-votes, with only the last one being valid. Therefore, if you were coerced to vote against your will, you can still cast another vote later and thus, in essence, overwrite your previous coerced vote.</p> <p>It's not a perfect system. Given that elections end at a specified time, say Saturday at 12:00PM, the coercer can arrive at your place at 11:55PM, force you to cast a vote and wait for 5 minutes to make sure you haven't cast another vote.</p> <p>As far as I can see, Estonians solve the issue by doing e-voting first, physical voting second. That way, you can still visit a voting booth after the end of e-voting period and cast the vote in person. Physical vote beats the e-vote.</p> <p>There goes the prospect of fully online elections! Estonian system requires physical polling places. Even worse, the coercer can follow the voter during the whole voting period and beat the living deaylights out of them in case they try to vote at a polling place.</p> <p>So, after a rather lengthy introduction, let's have a look at the puzzle:</p> <p><strong>Imagine fully online election. We assume that voter has an electronic ID card to identify himself. He is supposed to choose one of several candidates (no write-ins, no preferences et c.) The coercer follows the voter for the entire voting period. He joins him before the candidates are announced and leaves after the official winner is announced.</strong></p> <p><strong>How would you prevent coercion (and vote selling) under such threat model?</strong></p> <p>Feel free to comment and post your suggestions below.</p> <p>Ready? Steady. Go!</p> <p><strong>Martin Sústrik, January 16th, 2015</strong></p> <div style="height:2em"></div> </div> <p>by <span class="printuser avatarhover"><a href="http://www.wikidot.com/user:info/martin-sustrik" ><img class="small" src="http://www.wikidot.com/avatar.php?userid=939&amp;amp;size=small&amp;amp;timestamp=1438465174" alt="martin_sustrik" style="background-image:url(http://www.wikidot.com/userkarma.php?u=939)" /></a><a href="http://www.wikidot.com/user:info/martin-sustrik" >martin_sustrik</a></span></p> 
				 	]]>
				</content:encoded>							</item>
					<item>
				<guid>http://250bpm.com/blog:46</guid>
				<title>Non-interactive zero knowledge proofs for fun and profit</title>
				<link>http://250bpm.com/blog:46</link>
				<description>

&lt;p&gt;[[div style=&amp;quot;width:50em&amp;quot;]]&lt;/p&gt;
&lt;p&gt;by &lt;span class=&quot;printuser avatarhover&quot;&gt;&lt;a href=&quot;http://www.wikidot.com/user:info/martin-sustrik&quot;  &gt;&lt;img class=&quot;small&quot; src=&quot;http://www.wikidot.com/avatar.php?userid=939&amp;amp;amp;size=small&amp;amp;amp;timestamp=1438465174&quot; alt=&quot;martin_sustrik&quot; style=&quot;background-image:url(http://www.wikidot.com/userkarma.php?u=939)&quot; /&gt;&lt;/a&gt;&lt;a href=&quot;http://www.wikidot.com/user:info/martin-sustrik&quot;  &gt;martin_sustrik&lt;/a&gt;&lt;/span&gt;&lt;/p&gt;
</description>
				<pubDate>Sun, 07 Dec 2014 05:47:49 +0000</pubDate>
												<content:encoded>
					<![CDATA[
						 <div style="width:50em"> <p>Intuitive explanations of the <em>zero knowledge proofs</em> on the web have been rare. There is the classic Alibaba's cave parable, but while it is generally comprehensible it's too simplistic, in my opinion, to explain the topic fully. Luckily, recent article by Matthew Green does a great job at providing an intuitive introduction to the topic:</p> <p><a href="http://blog.cryptographyengineering.com/2014/11/zero-knowledge-proofs-illustrated-primer.html">http://blog.cryptographyengineering.com/2014/11/zero-knowledge-proofs-illustrated-primer.html</a></p> <p>But while good explanations of <em>zero knowledge proofs</em> have been rare, intuitive explanations of <em>non-interactive zero knowledge proofs</em> are virtually non-existent.</p> <p>This article tries to fill in the gap.</p> <p>It builds on the foundation laid by the article linked above, so you'll have to read that one first.</p> <p>Done? If so, let's move on.</p> <p>First of all, let's note that the zero knowledge proofs, as introduced by the article, are based on conversation between the prover and the verifier: Google provides me with a solution covered by hats. I can decide to remove a pair of hats and check whether the uncovered part of the solution conforms to my original requirements. Then Google presents me with another solution covered by hats. Once again, I can remove two hats. And so on. This conversational style of proof is called <em>interactive proof</em>.</p> <p>It's easy to see why the conversation is essential to the proof. If there was no conversation, Google would have to choose the hats to remove themselves. But if they did, it would be trivial for them to fake the correct solution. They would just have to colour the two nodes under those hats in two different colours and don't even care about colouring all the other nodes.</p> <p>The same kind of reasoning, by the way, applies to any kind of interactive proof. To choose a simplest possible example, imagine you are a recruiter doing a job interview for a position that requires knowledge of Spanish. How would you make sure that the applicant actually speaks Spanish?</p> <p>You can simply ask them about Spanish word for &quot;cat&quot;. If they correctly reply that it's &quot;gato&quot; you'll ask what's the Spanish word for &quot;dog&quot;. And so on. After, say, 20 questions you are reasonably sure that the applicant does speak Spanish.</p> <p>However, the whole system breaks down if there is no conversation. If, for example, the applicant just sends their resume by mail and you are supposed to hire them based on that. In such case there is no way to know whether they actually speak Spanish. They can write down 20 words along with their Spanish equivalents but for all you know that may be the only 20 Spanish words they know.</p> <p>All that being said, zero knowledge proofs that would not require conversation (so called <em>non-interactive zero knowledge proofs</em>) would be extremely useful. The case of sending a resume by mail gives you the basic idea. However, there are more ineresting use cases to consider.</p> <p>Imagine, for example, that you want to prove something not to a single person, but rather to the general public. There's something like seven billion people out there, so having a conversation with each one of them is not physically feasible. If there was a way to do zero knowledge proofs in a non-intractive fashion you could just do the proof, publish it on your website and anyone interested could verify it.</p> <p>Alternatively, you may want to prove something, but keep you identity secret. If you were required to have a conversation with the prover, they could track you down relatively easily. If, on the other hand, there were non-interactive zero knowledge proofs, you could just print the proof out and shove it under the verifier's front door.</p> <p>Yet another interesting use case is when you want the proof to be long-lived. If you want it to work after you went offline or even after you are dead the conversation is not an option.</p> <p>Ok, we are facing a dilemma here. Non-interactivity is a highly desirable property. Yet, interactivity seems to be essential to zero knowledge proofs. If there is no conversation, the prover can easily fake the solution. Now what?</p> <p>Let's return to the original problem of colouring a graph. The engineers at Google will now play the part of both prover and verifier. They will take the solution, re-colour the graph in a random way and make a committment for each node (cover it by a hat). Then they will choose a vertex and remove the adjacent pair of hats. The whole process will be repeated multiple times and the results will be sent to me. The more iterations, the more confident I am about Google knowing the correct solution.</p> <p>Everything works exactly as it used to except that the vertex to uncover is chosen by Google instead of me.</p> <p>However, where I have just chosen a vertex in random fashion, Google has to go an extra mile to convince me that they haven't known which vertex will be chosen at the time when they were committing to the solution. If they did know, after all, they could have made a fake solution where only the two nodes to be uncovered were coloured correctly.</p> <p>So, this is what they are going to do: They will take all the commitments (hats) from all the iterations of the proof, join them into a single batch and compute the hash of the batch. The hash will then be treated as if it was a sequence of integers. In each iteration of the proof one integer will be taken from that sequence and used as an index of the vertex to uncover.</p> <p>I am sure you can see the beauty of the solution now. It is as if Google have caught themselves into a trap. They can compute the hash and thus find out which pair of hats to remove only after they've committed to a solution. There's no way to craft the solution in such a way as to force uncovering of any particular vertex. If, on the other hand, they want to cheat and change the solution post hoc they have to change at least one of the commitments (recall that the commitments encode the solution!), which in turn changes the hash. And different hash means that a different vertex will be uncovered. The trap have triggered!</p> <p>I have already vaguely hinted at possible applications of non-interactive zero knowledge proofs, however, to give you a better idea, let me say a few words about the problem that made me thinking about the topic.</p> <p>It had to do with Bitcoin.</p> <p>While Bitcoin does a good job at creating a token of value independent of any state, it has the unfortunate property that all the transactions are public and thus trackable by either the state or some other malevolent entity.</p> <p>What can we do to prevent that?</p> <p>Well, the &quot;amount&quot; field for each Bitcoin address can be encoded by its owner's key. That way the attacker can track the entire blockchain but still have no idea whether any particular transaction corresponds to buying a box of matches or buying the entire country of Belgium.</p> <p>The only problem with that kind of approach is that you have to convince the miners (who are in care of the distributed ledger) that you haven't created money from nothing. Sending 1 Bitcoin from one account, for example, and receiving 1,000,000 Bitcoins to a different account is not a valid transaction. Neither is sending 1,000,000 Bitcoins from an account that contains 1 Bitcoin.</p> <p>This is where the non-interactive zero knowledge proofs kick in. Similarly to Google proving that the adjacent nodes in the graph have different colours you can make a proof that the Bitcoin transaction you are attempting to do plays by the rules. The proof is then sent to the miners along with the transaction itself. Each miner can thus independently verify that the transaction is valid and deserves to be included into the ledger.</p> <p><strong>Martin Sústrik, December 7th, 2014</strong></p> <div style="height:2em"></div> </div> <p>by <span class="printuser avatarhover"><a href="http://www.wikidot.com/user:info/martin-sustrik" ><img class="small" src="http://www.wikidot.com/avatar.php?userid=939&amp;amp;size=small&amp;amp;timestamp=1438465174" alt="martin_sustrik" style="background-image:url(http://www.wikidot.com/userkarma.php?u=939)" /></a><a href="http://www.wikidot.com/user:info/martin-sustrik" >martin_sustrik</a></span></p> 
				 	]]>
				</content:encoded>							</item>
					<item>
				<guid>http://250bpm.com/blog:45</guid>
				<title>The Clockwork inside Game of Thrones</title>
				<link>http://250bpm.com/blog:45</link>
				<description>

&lt;p&gt;[[div style=&amp;quot;width:50em&amp;quot;]]&lt;/p&gt;
&lt;p&gt;by &lt;span class=&quot;printuser avatarhover&quot;&gt;&lt;a href=&quot;http://www.wikidot.com/user:info/martin-sustrik&quot;  &gt;&lt;img class=&quot;small&quot; src=&quot;http://www.wikidot.com/avatar.php?userid=939&amp;amp;amp;size=small&amp;amp;amp;timestamp=1438465174&quot; alt=&quot;martin_sustrik&quot; style=&quot;background-image:url(http://www.wikidot.com/userkarma.php?u=939)&quot; /&gt;&lt;/a&gt;&lt;a href=&quot;http://www.wikidot.com/user:info/martin-sustrik&quot;  &gt;martin_sustrik&lt;/a&gt;&lt;/span&gt;&lt;/p&gt;
</description>
				<pubDate>Thu, 27 Nov 2014 17:14:23 +0000</pubDate>
												<content:encoded>
					<![CDATA[
						 <div style="width:50em"> <p>When Ned Stark got executed by the end of first season of the TV series everyone went like: Ugh! Not possible! He's the protagonist! He just cannot die like this in the middle of the story! Ridiculous! What's the next season going to be about?</p> <p>The deep sense of confusion that the event causes is he result of a completely new story-telling device invented G.R.R. Martin. And when I say &quot;completely new&quot; I don't mean it in the sense of &quot;the best one this season&quot; but rather as &quot;I cannot think of such disruptive change to story telling technique without going back to the middle ages&quot;.</p> <p>Let me explain.</p> <p>Classic story-telling technique is based on building up the tension gradually until it reaches a climax, where it is dissolved. The higher the climax, the more captivating the story. After the climax the story ends. It's called &quot;dramatic structure&quot; and can be depicted like this:</p> <div class="image-container aligncenter"><img src="http://250bpm.wdfiles.com/local--files/blog:45/game1.png" alt="game1.png" class="image" /></div> <p>The image gets a bit more complex when you consider that there may be several mini-climaxes on the way towards the main climax:</p> <div class="image-container aligncenter"><img src="http://250bpm.wdfiles.com/local--files/blog:45/game4.png" alt="game4.png" class="image" /></div> <p>However, that's just a technical nuance. We can ignore it here.</p> <p>The important fact about the dramatic structure is that it doesn't fit to everything that happens in real life. Only a little part of life is actually &quot;dramatic&quot; in this sense&#8230;</p> <p>So, every writer must have at some time or other fancied writing a detective story where the protagonist is investigating a murder, clues are gathered, tension rises, it's finally becoming clear who the suspects are and then the detective is run over by bus. The end.</p> <div class="image-container aligncenter"><img src="http://250bpm.wdfiles.com/local--files/blog:45/game2.png" alt="game2.png" class="image" /></div> <p>The story is entirely plausible, it can very well happen in the real life, but nobody have actually written it. A narrative where dramatic tensions are left unresolved is deeply unsatisfactory to the reader and writing such a story is a literary suicide.</p> <p>Given that classic dramatic structure is incapable of capturing a lot of what's going on in life, there have been multiple experimental attempts to break out of the box. To give just one example, let's mention the absurd drama. However, none of those experiments have made it to the mainstream literature (or, for what it's worth, to the movies). The lack of dramatic structure results in stories that are kind of unsatisfactory and disappointing. That makes it a taboo for those writers who actually care about their audience.</p> <p>Enter G.R.R. Martin.</p> <p>The death of Ned Stark marks the point where classic dramatic structure crumbles, yet the readers are eagerly reading on.</p> <p>So what's going on there?</p> <p>Mr. Martin is cleverly building the tension in the side plots in parallel with the main plot. So, when the main plot crashes in a terrible and unexpected manner, the reader may be dismayed, offended, confused or enraged, but they continue to read on as they still want to know how the side plot ends. At that point the side plot becomes the main plot. Here's a picture:</p> <div class="image-container aligncenter"><img src="http://250bpm.wdfiles.com/local--files/blog:45/game3.png" alt="game3.png" class="image" /></div> <p>Note how the tension may drop occasionally, but it never drops below the level where the reader throws the book away in disgust.</p> <p>And that's not all. If it was, it would just be an intersting story-telling experiment. However, George Martin succeeds in turning this device into a positive generator of dramatic tension.</p> <p>In classic story the reader may be concerned about whether the protagonist dies, however, they are sure that if he does, it will happen by the end of the book. Which, obviously, lowers the tension: &quot;No need to wory now! We still have 200 pages to go!&quot;</p> <p>Not so in A Song of Ice and Fire. There's no guarantee than any character will survive next ten pages. The uncertainty the reader is deliberately kept in proves to be highly addictive. I guess it's the same kind of addiction one gets from gambling. The feeling that anything may happen.</p> <p>As a final note I would like to point out that although George Martin likes to slaughter his characters at a brisk pace, the technique is not limited to killing the protagonists. In fact, it is used all over the series, even if &quot;spoiling the plot&quot; is not always as obvious as a death of an important character and thus may go unnoticed by the reader.</p> <p>Take, for example, the part where Jaime Lannister is freed from captivity by Catelyn Stark and swears to go to King's Landing and exchange himself for lady Stark's daughters who are kept there as hostages. The dramatic tension stems from the fact that Jaime is generally considered to be an oath breaker. Is he going to keep the oath this time or will he just ignore it? The tension increases when he is freed from the escort loyal to lady Stark. Now he's free to do as he wills. He can just turn around and go elsewhere. What a temptation for an oath breaker!</p> <p>It's a great plot, cleverly built tension and all that. So how is it resolved in the end? Well, when Jamie arrives at King's Landing both lady Stark's daughters have already left. And Catelyn Stark and all her family was slaghtered in the meantime. So, even if Jamie decided to keep his oath, there would be nowhere to send the girls to. Jamie is effectively prevented from doing the dramatic decision! The story is ruined. Yuck! What kind of ending is that?</p> <p>All in all, this story-telling technique, although quite demanding on the writer (requiring to deliberately spoil carefully constructed plots) is a huge step forward and even if the popularity of Ice and Fire series fades in the future, G.R.R. Martin deserves to be remembered for inventing it.</p> <p><strong>Martin Sústrik, November 27th, 2014</strong></p> <div style="height:2em"></div> </div> <p>by <span class="printuser avatarhover"><a href="http://www.wikidot.com/user:info/martin-sustrik" ><img class="small" src="http://www.wikidot.com/avatar.php?userid=939&amp;amp;size=small&amp;amp;timestamp=1438465174" alt="martin_sustrik" style="background-image:url(http://www.wikidot.com/userkarma.php?u=939)" /></a><a href="http://www.wikidot.com/user:info/martin-sustrik" >martin_sustrik</a></span></p> 
				 	]]>
				</content:encoded>							</item>
					<item>
				<guid>http://250bpm.com/blog:44</guid>
				<title>Bullshit jobs</title>
				<link>http://250bpm.com/blog:44</link>
				<description>

&lt;p&gt;[[div style=&amp;quot;width:50em&amp;quot;]]&lt;/p&gt;
&lt;p&gt;by &lt;span class=&quot;printuser avatarhover&quot;&gt;&lt;a href=&quot;http://www.wikidot.com/user:info/martin-sustrik&quot;  &gt;&lt;img class=&quot;small&quot; src=&quot;http://www.wikidot.com/avatar.php?userid=939&amp;amp;amp;size=small&amp;amp;amp;timestamp=1438465174&quot; alt=&quot;martin_sustrik&quot; style=&quot;background-image:url(http://www.wikidot.com/userkarma.php?u=939)&quot; /&gt;&lt;/a&gt;&lt;a href=&quot;http://www.wikidot.com/user:info/martin-sustrik&quot;  &gt;martin_sustrik&lt;/a&gt;&lt;/span&gt;&lt;/p&gt;
</description>
				<pubDate>Thu, 27 Nov 2014 07:07:37 +0000</pubDate>
												<content:encoded>
					<![CDATA[
						 <div style="width:50em"> <p>There have been a nice <a href="http://www.smh.com.au/national/public-service/the-modern-phenomenon-of-nonsense-jobs-20130831-2sy3j.html">article by anthropologist David Graeber</a> about what he calls bullshit jobs.</p> <p>I recommend to read it in full, however, in case you don't, here's a short summary:</p> <p>In 1930's Keynes, reflecting on the improving productivity, predicted that by the turn of century we should be working only few hours a day.</p> <p>The productivity did in fact improve but, surprisingly, the working hours are longer today than they've used to be back then.</p> <p>So what's happening here?</p> <p>Mr. Graeber attributes the problem to the rise of &quot;bullshit jobs&quot;.</p> <p>Bullshit job is a job that even the person performing it considers useless. Typically it involves clerical work, moving paperwork around, going to meetings, pretending to be productive and so on. In short, positions like &quot;East Coast Strategic Vision Coordinator&quot;.</p> <p>To save myself the trouble of gathering anecdotal evidence I'll just point you to <a href="http://www.theatlantic.com/business/archive/2014/11/the-art-of-not-working-at-work/382121/?single_page=true">this article</a>.</p> <p>As for the hard evidence for existence of bullshit jobs, I don't have any. Maybe we'll get some data later on, when the phenomenon is actually investigated, but I would not bet my money on that: People are strongly motivated not to be too vocal about the fact that they are wasting their time in work, so collecting data is going to be a problem.</p> <p>Anyway, in the rest of the article I'll stick to the assumption that bullshit work is a phenomenon that actually exists in the real world. After all, if it does not, where do all the productivity gains disappear?</p> <p>The theory of bullshit jobs raises one extremely interesting question: How does it, for goodness' sake, manage to survive in the market economy?</p> <p>Surely the firms spending 20% of their costs on bullshit work would be outcompeted by their more efficient rivals.</p> <p>It's economics 101, right?</p> <p>To understand the gravity of the question, consider that what we see here is the most cherished value of capitalism, the profit, being sacrificed for reasons that are, mildly speaking, unclear. Firms are heavily investing in something that makes them less competitive! Just imagine how large a force we have to be dealing with here given that it's able to outbalance the fundamental force of profit seeking! The whole concept sounds just insane!</p> <p>Should we turn to conspiration theories then? Maybe the capitalists gathered in secret by the end of 60's and decided that all those hippies are far to dangerous to be left running around without any control and agreed to spend part of their profits to keep them at bay by overloading them with useless work?</p> <p>Well, conspiration theories rarely turn out to be true, so let's rather look for a systemic explanation of the phenomenon.</p> <p>And it's not that hard to see what's going on if you allow yourself to think a little bit outside of the box.</p> <p>Do you remember the neat trick from the evolutionary biology where altruism looks like a completely insane idea that nevertheless occurs in the nature &#8212; and how the apparent contradiction is suddenly resolved when you stop thinking about benefits to the organisms and start thinking about benefits to the individual genes?</p> <p>Can we do the same trick here and stop thinking about competing firms and rather start thinking about competing individuals?</p> <p>After all, firm isn't a person and can't really strive to maximise profits. Only people can.</p> <p>And when you look at the problem from the point of view of a rational individual, the situation is pretty clear: I need to work to feed myself and my family. If only 50% of population is needed to do all the necessary work, I have to cope with that. I have to accept a meaningless work if that's all I can get. If it's necessary to keep the work, I have to pretend and lie and cheat and support the whole edifice of bullshit work. Damn, I even have to beg, bribe and blackmail others to create a new bullshit position for me, if I can't get an existing one.</p> <p>So, in the end, large portion of the population, from CEOs to the lowliest interns, are trapped in bullshit jobs and none of them can really speak against it as they are all complicit and the only alternative they have is having no job and starving.</p> <p>The only people who can be opposed are those who do meaningful work &#8212; they will be needed no matter what happens. But even they have friends and family doing bullshit work and can't with good conscience ask the whole system to be dismantled.</p> <p>And that's it. The force that counter-balances the profit seeking is &#8212; surprise, surprise! &#8212; profit seeking, except that it is profit seeking by individuals rather than profit seeking by organisations.</p> <hr /> <p>What follows is a bag of minor observations about the system. If not particularly interested, feel free to skip it. All the important stuff have already been said, you are not going to miss any major point.</p> <h5><span>1. Mechanisms</span></h5> <p>Acknowledging the power of individual profit seeking as the main force behind bullshit jobs is a crucial step to understand the phenomenon. However, the question still remains about actual mechanisms that allow individual interests to prevail over interests of an organisation. Our political and societal system is after all explicitly designed to guarantee at least some efficiency of our institutions &#8212; and those built-in controls have to be somehow disabled or worked around to allow proliferation of bullshit work.</p> <h5><span>2. Government</span></h5> <p>Government and state administration is the primary suspect when looking for bullshit jobs.</p> <p>The feedback loop supposed to guarantee the efficiency is a political one, namely elections. If you are not happy with the efficiency of a particular department at the local tax office, vote for a different party next time.</p> <p>And when it is put this way, it's clear why it doesn't work. The feedback loop is so long and indirect that your vote has exactly zero effect on the staffing decisions of said department.</p> <p>Therefore, there is nothing to counterbalance the incentive for creating bullshit jobs.</p> <h5><span>3. Companies</span></h5> <p>Intuitively, one feels that firms should have better feedback loops than governments. First, they don't have absolute monopoly as the government does. Therefore, we can experiment with multiple of them at the same time and let the inefficient ones die. Second, &quot;voting&quot; is done on day-to-day basis by either buying or not buying their products, thus the feedback loop is much faster than the one provided by quadriennial elections.</p> <p>Still, the feedback loop is badly damaged. There's a disconnect between shareholders of the company which have incentive to make the company as efficient as possible and the management which is incentivised by individual profit &#8212; i.e. has tendency to create bullshit jobs. In the worst case the shareholders buy shares for speculative purposes (consider the case where shares are bought for millisecond intervals by high-frequency trading algortihms!) and don't care about internal working of the company. In such case the management is free to act with no outside control, driven only by the desire for individual profit.</p> <h5><span>4. Network effects</span></h5> <p>Let's now consider the case of a single-person company. The feedback loop is extremely short so there should be no space left for bullshit work, right?</p> <p>Well, having had a single-person company, I can attest that that's not the case.</p> <p>First, you are required to do non-trivial amount of paperwork by government. I have hired an external accountant to deal with that, but the fact that she was not a nominal part of the company and that she did it part-time only conceals the fact that she was doing non-trivial amount of bullshit work on behalf of my company.</p> <p>Second, if you have to deal with big companies you are sucked into their system of bullshit work. When, for example, trying to sell to them, you need a sales department to process all the RfPs and RfQs, to go to the meetings with the client and negotiate the terms, you need lawyers to write down contracts and so on.</p> <p>The bottom line is that because of network effects nobody is free from bullshit work (consider the amount of bullshit work unemployed are required to do!) unless, of course, they are living on a deserted island with no communication with the outside world.</p> <h5><span>5. Social dynamics</span></h5> <p>Having discussed why the traditional control mechanisms fail to push towards efficiency we still have to explain why the result is creating bullshit jobs rather than something else.</p> <p>I believe there are many possible scenarios and that the topic should be researched more extensively. For now, let me give you just one example of social dynamics that lead to creation of bullshit jobs:</p> <p>Imagine two departments, both having five employees, one headed by a good manager, other one by a bad one.</p> <p>The good manager keeps things in order, the department is working efficiently, all the work is done well and on time, there are no complains about the department.</p> <p>The bad manager fails to deliver good work, there are delays all the time and there are complaints from all over the place. The manager is, quite obviously, not going to blame his own inability and instead claims that he needs more people to do the work. Thus, more people are hired and not having much to do they try to validate their existence and invent more paperwork to deal with, organise more meetings and so on.</p> <p>Several years later there's an reorganisation going on and it turns out that the department headed by the good manager is still five people strong while the department headed by the bad manager has hundred employees now. Such imbalance seems irrational and thus the small department becomes part of the big one. The good manager becomes a subordinate of the bad manager and even his former department starts to swell.</p> <h5><span>5. Conditioning</span></h5> <p>One reason not to question the very existence of bullshit jobs is that they are so common. It's simply the way the world is. Nobody have ever experienced an alternative and thus it's kind of hard to even realise there's something wrong with the system. Even young people entering the workforce &#8212; which one would expect to be less biased than their elders &#8212; are already conditioned to accept the idea of meaningless work by spending a decade or two in shools. And the schools are ultimate bullshit job environments. Nothing you ever do in school is useful in any way. School is basically training you to do work for work's sake with the explicit goal of furthering your career (If you won't learn hard enough you'll end up as a garbage man!) rather than striving to do something meaningful.</p> <h5><span>6. Reinforcement</span></h5> <p>Doing bullshit work causes some amount of psychological unease and it seems that some mechanisms have evolved to deal with that. One that readily comes to mind is the constant back-patting to be seen in the environments with high amount of bullshit work. There's a constant flux of congratulations for achieving arbitrary milestones from managers to their subordinates, there are social events to celebrate the achievements and so on. And to stress the point that this is not some kind of plot by higher-level executives, but rather a system where everyone is complicit, there are also replies from the subordinates saying &quot;Yes, we did it! Congratulations to you and the entire team!&quot;, apparently trying to ease the mangers' cognitive dissonance.</p> <h5><span>7. Solutions</span></h5> <p>Finally, one can imagine solutions to the problem. Guaranteed income, for example, would help. If everyone gets enough money to survive without having to work, the incentive to do bullshit work would be largely diminished. The problem with such measures is that it takes time and effort to implement them and until then everyone has to support the existing system just to be able to survive. So it's kind of a deadlock situation.</p> <p><strong>Martin Sústrik, November 27th, 2014</strong></p> <div style="height:2em"></div> </div> <p>by <span class="printuser avatarhover"><a href="http://www.wikidot.com/user:info/martin-sustrik" ><img class="small" src="http://www.wikidot.com/avatar.php?userid=939&amp;amp;size=small&amp;amp;timestamp=1438465174" alt="martin_sustrik" style="background-image:url(http://www.wikidot.com/userkarma.php?u=939)" /></a><a href="http://www.wikidot.com/user:info/martin-sustrik" >martin_sustrik</a></span></p> 
				 	]]>
				</content:encoded>							</item>
					<item>
				<guid>http://250bpm.com/blog:43</guid>
				<title>Magic numbers in C</title>
				<link>http://250bpm.com/blog:43</link>
				<description>

&lt;p&gt;[[div style=&amp;quot;width:50em&amp;quot;]]&lt;/p&gt;
&lt;p&gt;by &lt;span class=&quot;printuser avatarhover&quot;&gt;&lt;a href=&quot;http://www.wikidot.com/user:info/martin-sustrik&quot;  &gt;&lt;img class=&quot;small&quot; src=&quot;http://www.wikidot.com/avatar.php?userid=939&amp;amp;amp;size=small&amp;amp;amp;timestamp=1438465174&quot; alt=&quot;martin_sustrik&quot; style=&quot;background-image:url(http://www.wikidot.com/userkarma.php?u=939)&quot; /&gt;&lt;/a&gt;&lt;a href=&quot;http://www.wikidot.com/user:info/martin-sustrik&quot;  &gt;martin_sustrik&lt;/a&gt;&lt;/span&gt;&lt;/p&gt;
</description>
				<pubDate>Thu, 27 Nov 2014 04:31:25 +0000</pubDate>
												<content:encoded>
					<![CDATA[
						 <div style="width:50em"> <p>I was asked today why I often use magic numbers in my code.</p> <p>It's a small issue, but I feel passionately about it as the current trend of replacing every constant by a symbolic name is extremely annoying. And insistence of some code analysis tools on treating all magic numbers as coding style errors drives me crazy.</p> <p>Just the give you the context, here's an example of magic number:</p> <div class="code"> <pre> <code>i = i + 4;</code> </pre></div> <p>And here's a &quot;fixed&quot; version of the code:</p> <div class="code"> <pre> <code>#define FOO 4 ... i = i + FOO;</code> </pre></div> <p>There are certainly cases where using symbolic names is good style. Two of them come immediately to mind:</p> <p>First, the case where numeric value of the constant is irrelevant. The number is just a placeholder for a symbolic name:</p> <div class="code"> <pre> <code>#define STATE_START 1 #define STATE_ACTIVE 2 #define STATE_SHUTTING_DOWN 3 #define STATE_DONE 4</code> </pre></div> <p>This is basically the use case for C enums or Ruby symbols (&quot;:foo&quot; and such).</p> <p>Second, the case where constant is used as a compile-time configuration parameter:</p> <div class="code"> <pre> <code>#define BUFFER_SIZE 4096</code> </pre></div> <p>The idea here is to define the value at a single point and then use the symbolic name elsewhere. That way, if need arises, we can change the value as needed by modifying a single line of code.</p> <p>All that being said, what about other use cases? Cosider this code (you often encounter that kind of thing when parsing network protocols):</p> <div class="code"> <pre> <code>uint8_t version = *((uint8_t*) ptr); ptr += 1; uint32_t seq = *((uint32_t*) ptr); ptr += 4; uint16_t id = *((uint16_t*) ptr); ptr += 2;</code> </pre></div> <p>What about magic numbers beaing added to 'ptr' in the example? Should we rewrite it in the following way?</p> <div class="code"> <pre> <code>#define VERSION_SIZE 1 #define SEQ_SIZE 4 #define ID_SIZE 2 ... uint8_t version = *((uint8_t*) ptr); ptr += VERSION_SIZE; uint32_t seq = *((uint32_t*) ptr); ptr += SEQ_SIZE; uint16_t id = *((uint16_t*) ptr); ptr += ID_SIZE;</code> </pre></div> <p>Does that make sense? Let's think about it.</p> <p>Does it make the code more readable? Well, no. It actually makes it less readable. If you see a line like this:</p> <div class="code"> <pre> <code>ptr += SEQ_SIZE;</code> </pre></div> <p>You have to find the definition of SEQ_SIZE at the different place in the file or even in a different file. Thus, smooth reading of the source code is interrupted by scrolling up and down, by grepping and so on.</p> <p>So maybe the use of the symbolic constant makes the code more maintainable? Maybe you can adjust the behaviour of the program by simply changing a single line of code? Well, no.</p> <p>This change:</p> <div class="code"> <pre> <code>#define SEQ_SIZE 3</code> </pre></div> <p>doesn't produce a modified program. It produces a broken program.</p> <p>The value of SEQ_SIZE is tightly bound to the program logic (namely to the fact that the field in question is of type uint32_t) and cannot be changed at will.</p> <p>You can still get rid of the magic number like this:</p> <div class="code"> <pre> <code>ptr += sizeof (uint32_t);</code> </pre></div> <p>But I would argue that it makes the code less readable when compared to simply using number &quot;4&quot; and I am not even speaking of extra effort you need to spend when typing the code.</p> <p>As a conclusion: Don't treat magic numbers as bad coding style automatically. In some cases replacing magic numbers by symbolic constants makes the code pretty awful to read and maintain.</p> <p><strong>Martin Sústrik, November 27th, 2014</strong></p> <div style="height:2em"></div> </div> <p>by <span class="printuser avatarhover"><a href="http://www.wikidot.com/user:info/martin-sustrik" ><img class="small" src="http://www.wikidot.com/avatar.php?userid=939&amp;amp;size=small&amp;amp;timestamp=1438465174" alt="martin_sustrik" style="background-image:url(http://www.wikidot.com/userkarma.php?u=939)" /></a><a href="http://www.wikidot.com/user:info/martin-sustrik" >martin_sustrik</a></span></p> 
				 	]]>
				</content:encoded>							</item>
					<item>
				<guid>http://250bpm.com/blog:42</guid>
				<title>Are you a programmer-mathematician or a programmer-handyman?</title>
				<link>http://250bpm.com/blog:42</link>
				<description>

&lt;p&gt;[[div style=&amp;quot;width:50em&amp;quot;]]&lt;/p&gt;
&lt;p&gt;by &lt;span class=&quot;printuser avatarhover&quot;&gt;&lt;a href=&quot;http://www.wikidot.com/user:info/martin-sustrik&quot;  &gt;&lt;img class=&quot;small&quot; src=&quot;http://www.wikidot.com/avatar.php?userid=939&amp;amp;amp;size=small&amp;amp;amp;timestamp=1438465174&quot; alt=&quot;martin_sustrik&quot; style=&quot;background-image:url(http://www.wikidot.com/userkarma.php?u=939)&quot; /&gt;&lt;/a&gt;&lt;a href=&quot;http://www.wikidot.com/user:info/martin-sustrik&quot;  &gt;martin_sustrik&lt;/a&gt;&lt;/span&gt;&lt;/p&gt;
</description>
				<pubDate>Fri, 25 Jul 2014 05:40:08 +0000</pubDate>
												<content:encoded>
					<![CDATA[
						 <div style="width:50em"> <p>I always assumed that programmers hate complex tools and, if given the option, choose simple tools that get the task done over complex multi-purpose ones. (And that complexity we encounter in the tools these days is mainly because develops are paid by hour rather than by delivery. If dealing with a complex tool meant you'll miss your kid's baseball game rather than just the project schedule slipping slightly, there would immediately be an enourmous pressure on tool developers to keep the tools simple and make them just work.)</p> <p>Recent <a href="https://www.tbray.org/ongoing/When/201x/2014/07/17/Discouraged-Developer">blog post from Tim Bray</a> confirms that sentiment (to keep it short, he says that the tools today got so complex that programming ceased to be fun) but <a href="http://www.reddit.com/r/programming/comments/2bi4yz/just_let_me_code/">the subsequent discussion on Reddit</a> shows that lot of people disagree wih him and say, in essence, &quot;Stop whining and start learning the tools.&quot;</p> <p>Putting aside the convenient explanation that it's just the Stockholm syndrome speaking, there are, as far as I can say, two possible explanations for the phenomenon.</p> <p><strong>Theory 1: People become less tolerant of complexity as they grow older.</strong></p> <p>While you are a newbie programmer, learning new technology is fun. It allows you to do new cool stuff and improve your skills in overall.</p> <p>However, after 20 years in the industry learning a new framework that does exactly the same thing for umpteenth time gets annoying and makes you feel that you are wasting your time rather than doing anything useful.</p> <p><strong>Theory 2: Some people are born complexity-lovers while others are born complexity-haters.</strong></p> <p>According to this theory there are programmers who are mathematicians in their hearts. They love abstract problems and divising formally sound solutions for the problems. Once the solution is known, only the trivial task of implemening it remains. Programmer-mathematician wants to get rid of this boring task as fast as possible and thus prefers simple tools which don't require him to learn new stuff or spend his time trouble-shooting the tool.</p> <p>And then there are programmers who are handymen under their skin. They love to assemble stuff. They are McGyver types capable of knitting an anti-shark suit out of old firehose. The ideal milieu for the handyman is a junkyard. After all, junkyard means many more cool anti-shark suits and fire-extinguishing slippers, many more Rube Goldberg machines and Frankenstein monsters. Handymen love complex multi-purpose tools because they are easy to repurpose and to use essentially as spare parts.</p> <p>Finding out wheher theory 1 or theory 2 is correct is relevant for tool developers. In the former case there's no much point in creating complex tools unless, of course, you want to address specifically the segment of newbie developrs and become a star among teenage programmers. In the latter case, on the other hand, your complex tool is meant to be used by handymen and you should thus expect your shiny new web-development framework to become a component in an improvised cheese slicer. You should thus try to relax restricions placed on the user, make the tool easy to take apart, modify and be used it in unexpected ways.</p> <p>Additionally, determining which theory is correct would make an interesing data point for psychology of programming research.</p> <p>So, here's a simple way you can help with finding out what's the case. In the comments below specify numer of years you are programming and choose one of the following options:</p> <blockquote> <p>A: Tim is right. Tools have got too complex these days and should be thoroughly simplified.</p> </blockquote> <blockquote> <p>B: Tim is wrong. He should stop whining and learn the tools.</p> </blockquote> <p>After we have sufficient numer of answers, I intend to chart the percentage of complexity-lovers vs. complexity-haters according to the experience level. If the resulting graph looks like this:</p> <div class="image-container aligncenter"><img src="http://250bpm.wdfiles.com/local--files/blog:42/handyman1.png" alt="handyman1.png" class="image" /></div> <p>Then the theory 1 is correct: People get less complexity-tolerant as they grow more experienced. If, on the other hand, the graph looks like this:</p> <div class="image-container aligncenter"><img src="http://250bpm.wdfiles.com/local--files/blog:42/handyman2.png" alt="handyman2.png" class="image" /></div> <p>Then the theory 2 is correct: Some people are mathematicians while others are handymen. The latter love complexity. It has nothing to do with experience level.</p> <p><strong>Martin Sústrik, July 25th, 2014</strong></p> <div style="height:2em"></div> </div> <p>by <span class="printuser avatarhover"><a href="http://www.wikidot.com/user:info/martin-sustrik" ><img class="small" src="http://www.wikidot.com/avatar.php?userid=939&amp;amp;size=small&amp;amp;timestamp=1438465174" alt="martin_sustrik" style="background-image:url(http://www.wikidot.com/userkarma.php?u=939)" /></a><a href="http://www.wikidot.com/user:info/martin-sustrik" >martin_sustrik</a></span></p> 
				 	]]>
				</content:encoded>							</item>
					<item>
				<guid>http://250bpm.com/blog:41</guid>
				<title>Idiot&#039;s Guide to ABI Versioning</title>
				<link>http://250bpm.com/blog:41</link>
				<description>

&lt;p&gt;[[div style=&amp;quot;width:50em&amp;quot;]]&lt;/p&gt;
&lt;p&gt;by &lt;span class=&quot;printuser avatarhover&quot;&gt;&lt;a href=&quot;http://www.wikidot.com/user:info/martin-sustrik&quot;  &gt;&lt;img class=&quot;small&quot; src=&quot;http://www.wikidot.com/avatar.php?userid=939&amp;amp;amp;size=small&amp;amp;amp;timestamp=1438465174&quot; alt=&quot;martin_sustrik&quot; style=&quot;background-image:url(http://www.wikidot.com/userkarma.php?u=939)&quot; /&gt;&lt;/a&gt;&lt;a href=&quot;http://www.wikidot.com/user:info/martin-sustrik&quot;  &gt;martin_sustrik&lt;/a&gt;&lt;/span&gt;&lt;/p&gt;
</description>
				<pubDate>Wed, 11 Jun 2014 12:01:21 +0000</pubDate>
												<content:encoded>
					<![CDATA[
						 <div style="width:50em"> <p>Libtool ABI versioning is used in both nanomsg and ZeroMQ, so I've been dealing with it for a long time now.</p> <p>However, for some not completely obvious reason it seems almost impossible to remember how it works. I've observed myself as well as other people forgetting the rules over and over again.</p> <p>This short article is my attempt to formulate libtool ABI versioning rules in such a way that they can be actually remembered.</p> <p>First of all, don't think of the ABI version major/minor/patch version, e.g. &quot;4.23.1&quot;, rather, think of it as just major/minor version (&quot;4.23&quot;) with a special additional number called &quot;age&quot; which is conceptually different from the other two numbers. The right way to think of that particular ABI version is thus &quot;version 4.23, age 1&quot;.</p> <p>Major and minor (called <em>current</em> and <em>revision</em> in libtool terminology) work exactly as you would expect them to work. The only rule to follow is that any change to ABI requires new major version. Thus, minor versions are used only to distinguish between versions that have exactly the same interface but different implementations.</p> <p>&quot;Age&quot; is the confusing part. What it refers to is backward compatibility. Specifically, how many major versions are backward compatible with the current one. In case of &quot;version 4.23, age 1&quot; there's exactly one old major version (3.x) compatible with the current major version (4.x), thus, if you have code compiled with version 3.x it should still work if you link it to 4.x library.</p> <p>And that's it.</p> <p>For canonical info on libtool versioning look here:</p> <p><a href="http://www.gnu.org/software/libtool/manual/html_node/Libtool-versioning.html">http://www.gnu.org/software/libtool/manual/html_node/Libtool-versioning.html</a><br /> <a href="http://www.gnu.org/software/libtool/manual/html_node/Updating-version-info.html">http://www.gnu.org/software/libtool/manual/html_node/Updating-version-info.html</a></p> <p><strong>Martin Sústrik, Jun 10th, 2014</strong></p> <div style="height:2em"></div> </div> <p>by <span class="printuser avatarhover"><a href="http://www.wikidot.com/user:info/martin-sustrik" ><img class="small" src="http://www.wikidot.com/avatar.php?userid=939&amp;amp;size=small&amp;amp;timestamp=1438465174" alt="martin_sustrik" style="background-image:url(http://www.wikidot.com/userkarma.php?u=939)" /></a><a href="http://www.wikidot.com/user:info/martin-sustrik" >martin_sustrik</a></span></p> 
				 	]]>
				</content:encoded>							</item>
					<item>
				<guid>http://250bpm.com/blog:40</guid>
				<title>Unit Test Fetish</title>
				<link>http://250bpm.com/blog:40</link>
				<description>

&lt;p&gt;[[div style=&amp;quot;width:50em&amp;quot;]]&lt;/p&gt;
&lt;p&gt;by &lt;span class=&quot;printuser avatarhover&quot;&gt;&lt;a href=&quot;http://www.wikidot.com/user:info/martin-sustrik&quot;  &gt;&lt;img class=&quot;small&quot; src=&quot;http://www.wikidot.com/avatar.php?userid=939&amp;amp;amp;size=small&amp;amp;amp;timestamp=1438465174&quot; alt=&quot;martin_sustrik&quot; style=&quot;background-image:url(http://www.wikidot.com/userkarma.php?u=939)&quot; /&gt;&lt;/a&gt;&lt;a href=&quot;http://www.wikidot.com/user:info/martin-sustrik&quot;  &gt;martin_sustrik&lt;/a&gt;&lt;/span&gt;&lt;/p&gt;
</description>
				<pubDate>Thu, 05 Jun 2014 09:48:14 +0000</pubDate>
												<content:encoded>
					<![CDATA[
						 <div style="width:50em"> <p>I hear that prople feel an uncontrollable urge to write unit tests nowaydays.</p> <p>If you are one of those affected, spare few minutes and consider these reasons for NOT writing unit tests:</p> <h3><span>1.</span></h3> <p><strong>Return on investment of unit tests is an order(s) of magnitude lower than that of end-to-end tests.</strong></p> <p>If you write a single end-to-end test the execution of the test will probably cover a substantial part of the codebase:</p> <div class="image-container aligncenter"><img src="http://250bpm.wdfiles.com/local--files/blog:40/fet1.png" alt="fet1.png" class="image" /></div> <p>If you write a single unit test, you cover just a small piece of the codebase:</p> <div class="image-container aligncenter"><img src="http://250bpm.wdfiles.com/local--files/blog:40/fet2.png" alt="fet2.png" class="image" /></div> <p>If you are painting a house, you want to start with a biggest brush at hand and spare the tiny brush for the end to deal with fine details.</p> <p>If you begin your QA work with unit tests, you are essentially trying to paint entire house using the finest chinese calligraphy brush.</p> <h3><span>2.</span></h3> <p><strong>End-to-end tests test the critical path. Unit test do not.</strong></p> <p>End-to-end tests typically simulate real-world usage scenarios. Thus, after running end-to-end test you have reasonable level of confidence that the product is going to work for the actual user.</p> <p>If all you have are unit tests, you are pretty sure that all the individual gears inside the project work as expected. However, you have no idea whether the project as a whole works or not. It may well be that the user won't be able to performs a single task.</p> <p>Yes, unit tests are rigorous and make sure that the component will work even in corner cases. However, user wants the product to work in common cases in the first place. If it fails in common cases it's not a product. It's a failure.</p> <p>On the other hand, if the product fails in exotic cases that happen rarely or never at all, the defect can be tolerated and possibly fixed later on.</p> <h3><span>3.</span></h3> <p><strong>Unit tests ossify the internal architecture.</strong></p> <p>Imagine you have three components, A, B and C. You have written extensive unit test suite to test them.</p> <p>Later on you decide to refactor the architecture so that functionality of B will be split among A and C. you now have two new components with diferent interfaces.</p> <p>All the unit tests are suddenly rendered useless. Some test code may be reused but all in all the entire test suite has to be rewritten.</p> <p>This way the unit test suite makes the product resistent to internal change. A programmer with limited time to allocate on tasks will consider the refactoring, then consider the cost of rewriting the test suite and place to whole endeavour into the &quot;not worth it&quot; mental slot.</p> <p>I have seen a nice definition of &quot;architecture&quot; somewhere. It said that architecture is that what makes software able to change. If we embrace this point of view, unit tests can be considered to be a strictly an anti-architecture device.</p> <h3><span>4.</span></h3> <p><strong>There are things that can't be unit-tested.</strong></p> <p>Consider a protocol decoder. The protocol says &quot;9th byte in the packet is called TLL&quot;.</p> <p>The implementation does this:</p> <div class="code"> <pre> <code>int ttl = packet [8];</code> </pre></div> <p>How are you supposed to unit test that? You can create a fake packet that has value 123 as 9th byte and then check that the decoder extracts TTL of 123. How is that different from testing that 1 == 1 though?</p> <p>Protocol layout is a definition, not an algorithm and there's little to test there. What works though is interoperability testing: Take two implementations of the same protocol and check whether they speak each to another. And once again, we end up with end-to-end tests instead of unit tests.</p> <h3><span>5.</span></h3> <p><strong>Some stuff has no rigorous acceptance criteria.</strong></p> <p>Some of the code we write can be described in almost mathematical rigour: lists, hashmaps, 3D graphics et c.</p> <p>Other code is not meant to be defined so rigorously. It's meant to feel good to the user. The most obvious example is GUI.</p> <p>In such cases there's no right or wrong behaviour, there's only good or bad behaviour. And good or bad is much harder to define and test.</p> <p>Admittedly, end-to-end tests and unit tests face the same problem here.</p> <p>However, it's likely that the project has a high-level description of how the end user experience should look like. You can use that as basis for the tests.</p> <p>Not so with individual components. After all, what does &quot;end user experience&quot; even mean for an internal component? In such case you should think of components as malleable pieces of code whose sole purpose it to enable globally (end-to-end) defined user experience. If the end-to-end experience is OK, the component is OK as well.</p> <h3><span>6.</span></h3> <p><strong>Unit tests are easy to measure and you should fear that.</strong></p> <p>If you are working in a hierarchically structured company, beware!</p> <p>Progress on project has to be reported up the command chain and with a mainly spiritual activity such as software development it's pretty hard to find any hard metrics. How the hell are we supposed to measure QA?</p> <p>And there's an easy solution to that: Report number and/or code coverage of unit tests!</p> <p>But that leads into a trap: Once you start reporting code coverage of unit tests, you'll be immediately pressured to improve the metric, with the goal of achieving 100% coverage, despite all the problems with unit testing outlined above. The problems, after all, are not quantifiable, but the code coverage is.</p> <p>It's hard to say how to fight the scenario above. Maybe keeping the unit test coverage so embarassingly low that nobody even thinks of reporting it to their superiors would help&#8230;</p> <h3><span>7.</span></h3> <p>All that being said, unit tests are great for testing complex algorithmical tasks with strictly defined behaviour and lot of corner cases. Don't even think about implementing a red/black tree without an extensive suite of unit tests.</p> <p>However, be frank: How often do you implement red/black trees?</p> <p>And more generally: Is there really a rational justification for all the unit tests you are writing?</p> <p>Think about it for a minute and you may spare yourself a lot of useless work.</p> <p><strong>EDIT:</strong> There seems to be some confusion about what is &quot;unit test&quot; and what is &quot;end-to-end test&quot;. In the context of this article, &quot;end-to-end&quot; test means test that uses external interface of the product (one that is visible to the end user). &quot;Unit test&quot;, on the other hand, means test that uses internal interfaces within the product (those which are not visible to the end user).</p> <p><strong>Martin Sústrik, Jun 4th, 2014</strong></p> <div style="height:2em"></div> </div> <p>by <span class="printuser avatarhover"><a href="http://www.wikidot.com/user:info/martin-sustrik" ><img class="small" src="http://www.wikidot.com/avatar.php?userid=939&amp;amp;size=small&amp;amp;timestamp=1438465174" alt="martin_sustrik" style="background-image:url(http://www.wikidot.com/userkarma.php?u=939)" /></a><a href="http://www.wikidot.com/user:info/martin-sustrik" >martin_sustrik</a></span></p> 
				 	]]>
				</content:encoded>							</item>
					<item>
				<guid>http://250bpm.com/blog:39</guid>
				<title>Design of PUB/SUB subsystem in ØMQ</title>
				<link>http://250bpm.com/blog:39</link>
				<description>

&lt;p&gt;[[div style=&amp;quot;width:50em&amp;quot;]]&lt;/p&gt;
&lt;p&gt;by &lt;span class=&quot;printuser avatarhover&quot;&gt;&lt;a href=&quot;http://www.wikidot.com/user:info/martin-sustrik&quot;  &gt;&lt;img class=&quot;small&quot; src=&quot;http://www.wikidot.com/avatar.php?userid=939&amp;amp;amp;size=small&amp;amp;amp;timestamp=1438465174&quot; alt=&quot;martin_sustrik&quot; style=&quot;background-image:url(http://www.wikidot.com/userkarma.php?u=939)&quot; /&gt;&lt;/a&gt;&lt;a href=&quot;http://www.wikidot.com/user:info/martin-sustrik&quot;  &gt;martin_sustrik&lt;/a&gt;&lt;/span&gt;&lt;/p&gt;
</description>
				<pubDate>Sun, 04 May 2014 05:46:25 +0000</pubDate>
												<content:encoded>
					<![CDATA[
						 <div style="width:50em"> <p><strong>This article was originally written on June 23rd 2011. As there was some discussion of subscription forwarding on nanomsg mailing list, I am republishing an enhanced version of the article it in this blog.</strong></p> <h1><span>Introduction</span></h1> <p>PUB/SUB (publish/subscribe) is a messaging pattern used to distribute data to arbitrary number of subscribers. In context of ØMQ the focus with PUB/SUB is on scalability, ie. on ability to add more subscribers to the system without overloading it.</p> <h1><span>Minimising Network Traffic</span></h1> <p>Aside of the data distribution itself, the main design goal for PUB/SUB subsystem is to minimise the network traffic. Afterall, what PUB/SUB boils down to is multicasting which is notorious for overloading the networks. Thus, the design described herewith is based on the following requirement:</p> <p><em>If there's no subscriber for a particular message in a specific portion of the distribution tree the data should not even traverse associated network segments.</em></p> <p>While the requirement seems to be self-obvious, it is often not honoured in messaging systems in use today. In most broker-based systems consumers subscribe for messages with the broker, however, there's no way for broker to subscribe for messages with the publisher. So, even if there is no consumer interested in the message it is still passed from the publisher to the broker, just to be dropped there:</p> <div class="image-container aligncenter"><img src="http://250bpm.wdfiles.com/local--files/pubsub/ps1T.png" alt="ps1T.png" class="image" /></div> <p>What we want to achieve instead is a system where the message nobody is interested in is dropped as soon as possible, presumably even before being sent to the network:</p> <div class="image-container aligncenter"><img src="http://250bpm.wdfiles.com/local--files/pubsub/ps2T.png" alt="ps2T.png" class="image" /></div> <p>We also want this property to be transitive, ie. even if there are multiple intermediary nodes (brokers, devices) between publisher and consumer, the unneeded messages should still be dropped as soon as possible:</p> <div class="image-container aligncenter"><img src="http://250bpm.wdfiles.com/local--files/pubsub/ps3T.png" alt="ps3T.png" class="image" /></div> <p>More generically, we want message to travel only through those lattices in the message distribution tree that lead to consumers interested in the message:</p> <div class="image-container aligncenter"><img src="http://250bpm.wdfiles.com/local--files/pubsub/ps4T.png" alt="ps4T.png" class="image" /></div> <p>Additionally, it would be preferable to pass message through any edge of the graph once only even if there are multiple consumers down that branch of the tree. In other words, message duplication should happen as far away down the stream as possible &#8212; in this case in the node D rather than in the node P.</p> <h2><span>Subscription Forwarding</span></h2> <p>To achieve this kind of design subscriptions have to be passed to the upstream nodes so that the upstream nodes can use them to avoud sending messages to the paths where there are no interested subscribers. The property should be transitive, ie. upstream node should forward the subscription to its parent which in turn should pass it to its parent etc.</p> <div class="image-container aligncenter"><img src="http://250bpm.wdfiles.com/local--files/pubsub/ps5T.png" alt="ps5T.png" class="image" /></div> <p>So, when consumer application subscribes to a topic the subscription, in addition to applying locally, hops up the stream until it reaches the ultimate producer(s).</p> <p>The technical difficulty with this model is how to pass the subscription through the intermediate devices. Recall that a device is just a pair of ØMQ sockets connected by user code. Thus, there's no way to implement subscription forwarding in devices without user code taking part in the process. And that's what XPUB/XSUB sockets are for.</p> <p>XPUB is similar to a PUB socket, except that you can receive messages from it. The messages you receive are the subscriptions traveling upstream. XSUB is similar to SUB except that you subscribe by sending a subscription message to it.</p> <p>Subscription messages are composed of a single byte, 0 for unsubscription and 1 for subscription, followed by the subscription itself, ie. the prefix we want to check the messages for.</p> <p>This makes implementation of a device quite simple: Use XPUB/XSUB sockets instead of PUB/SUB, pass all the messages from XSUB to XPUB (data flow) and all the messages from XPUB to XSUB (subscription flow):</p> <div class="image-container aligncenter"><img src="http://250bpm.wdfiles.com/local--files/pubsub/ps6T.png" alt="ps6T.png" class="image" /></div> <h1><span>Disconnections and Reconnections</span></h1> <p>To avoid storing stale subscriptions in individual nodes, disconnection of a peer should result in automatic generation all the relevant unsubscriptions to be passed upstream. For example, if XPUB socket has a single peer who subscribed for topic &quot;A&quot;, disconnection of that peer should result in generation of unsubscription &quot;A&quot; that will hop upstream until it reaches terminal producer(s).</p> <p>The other way round, when reconnection happens, XSUB socket should send all its subscriptions to the reconnected peer to reestablish the message flow. Recall that the peer (XPUB) have generated unsubscriptions when disconnection happened. By re-sending the subscriptions the routing tables on the publisher side are renewed and the message distribution continues to work as expected.</p> <h1><span>Message routing &amp; filtering</span></h1> <p>Let's assume the subscriptions are correctly distributed within the message distibution tree. Let's now have a look at how they should be used by individual nodes to route the messages downstream.</p> <p>The routing in (X)PUB socket is pretty straightforward: If message matches any subscription received from a particular peer, send it to that peer. Care should be taken to send the message once only even if it matches several subscriptions.</p> <p>It's less obvious whether messages should be filtered in the (X)SUB socket. There are three aspects to take into account:</p> <ol> <li>In case of uni-directional transports such as PGM, the PUB socket does no filtering. Thus, SUB socket gets the whole feed of messages and has to filter out unneeded messages itself.</li> <li>When user unsubscribes from a certain topic, the unsubscription is sent upstream and once it reaches the terminal producer, the producer won't sent any corresponding messages downstream. However, this operation takes time. In the meanwhile messages that match the no longer existent subscription keep flowing downstream.</li> <li>Device reads messages from XSUB socket and passes them to XPUB socket. Thus, the filtering would be done twice in a row is filtering is implemented in both XPUB and XSUB socket.</li> </ol> <p>To address these problems, ØMQ filters messages in SUB socket, but not in XSUB socket.</p> <h1><span>Subscription Forwarding and Uni-directional Transports</span></h1> <p>One specific problem with the above model is that it requires bi-directional communication: data are passed one way, subscriptions are passed the other way. Thus, we have to reconcile this model with uni-directional transports such as PGM (reliable multicast).</p> <p>The solution at the moment is not to forward subscriptions upstream from PGM consumer socket to the PGM publisher. Additionally XPUB socket would send a &quot;subscribe to all&quot; subscriptions upstream once it is bound to PGM transport. That way every message from upstream is multi-casted and filtering happens in the XSUB socket.</p> <p>This solution is clearly not minimal. Messages are passed on the network even though nobody is subscribed to them. This can be &#8212; and often is &#8212; considered acceptable with multicast transports. However, if there's need to optimise this scenario further future work can focus on two areas:</p> <ol> <li>Using IGMP as a way to pass subscriptions upstream via PGM. This would of course limit the filtering capabilities to &quot;subscribe to this multicast group&quot; instead of &quot;subscribe to all messages starting with this prefix&quot;. It should also be noted that IGMP passes subscriptions only from consumer to the network switch/router. They aren't passed any further (from the switch to the publisher).</li> <li>Creating an additional data flow from PGM subscriber to PGM publisher (via TCP, UDP or whatever) thus basically turning uni-directional transport into a bi-directional one.</li> </ol> <h1><span>Subscription Aggregation</span></h1> <p>There's is one additional scalability bottleneck with the above model.</p> <p>Imagine the case of global TV broadcast over ØMQ. All the viewers are subscribing to different channels, some of them are changing channels in pretty high rate etc. If all the subscriptions were forwarded all the way up to the ultimate broadcasting server, the server would spend all its time processing subscriptions with no time remaining to broadcast the content itself. Or, more likely, it would collapse under the weight of all the (un)subscriptions coming in.</p> <p>To make the PUB/SUB model really scalable we have to aggregate the subscriptions in such a way that subscription traffic at each lattice of the distribution tree is roughly the same.</p> <p>To achieve that we can send just the diffs rather than passing each individual subscription. That way we can ensure that the subscription traffic at publisher won't grow exponentially with depth of the tree. The current implementation does so by eliminating duplicate (un)subscriptions.</p> <p>Imagine an XPUB socket. Consumer connects and subscribes for messages starting with &quot;A&quot;. XPUB socket forwards the subscription upstream. Then another consumer connects and subscribes for &quot;A&quot;. This time, XPUB socket increases the reference count on it's internal represenation of the subscription, but doesn't send the subscription upstream &#8212; it's already there so why bother?</p> <p>The other way round, when the first consumer unsubscribes or disconnects, XPUB socket decrements the reference count on the topic, but doesn't forward anything upstream. There's still the second consumer that needs the &quot;A&quot; messages anyway. Only when the second consumer disconnects as well, the unsubscription is sent upstream.</p> <p>The algorithm above is clearly not optimal and can be improved in the future.</p> <p>If there an &quot;A&quot; subscription, &quot;AB&quot; subscription is still passed upstream even though it is subsumed by &quot;A&quot;. We could simply drop the &quot;AB&quot; subscription and everything would work as expected. We would have to compensate for the optimisation during unsubscription from &quot;A&quot; though. Instead of sending <em>unsubscribe(A)</em> up the stream we would have to send sequence of <em>subscribe(AB)</em> and <em>unsubscribe(A)</em>.</p> <p>You can imagine different optimisations to the aggregation algorithm.</p> <p>However, the crucial point is that the PUB/SUB protocol is designed in such a way that the actual aggregation algorithm doesn't matter. Every node in the distribution tree can use its own aggregation algorithm and the whole distribution tree would still work as expected. Hopefully, different implementations of scalability protocol can take advantage of the fact and differentiate on the basis of more or less efficient subscription aggregation algorithms.</p> <h1><span>Plugging into the Subscription Mechanism</span></h1> <p>Imagine the case where producing a message is a CPU-intensive task. Producer would like to eliminate the work in case there's nobody interested in the message. To do that it has to be aware of the subscriptions issued by the downstream nodes.</p> <p>Luckily, this is exactly what XPUB socket allows it to do. It can open an XPUB socket and start listening for the subscriptions. If a message is about to be generated, the producer will check its list of subscriptions and proceed only if there's an appropriate subscription present.</p> <p>Another scenario for plugging into subscription mechanism is creating bridges to other messaging implementations, both generic and ad hoc in-house systems.</p> <p>Let's take an example of ØMQ/AMQP bridge. The bridge can create an XPUB socket to send messages from AMQP to ØMQ. The other way round it can read the subscriptions from the socket and send them as <em>queue.bind</em> commands to the AMQP system:</p> <div class="image-container aligncenter"><img src="http://250bpm.wdfiles.com/local--files/pubsub/ps7T.png" alt="ps7T.png" class="image" /></div> <p>The same way, bridges can be built to interface with other messaging systems: JMS, STOMP, PubSubHubbub, Redis etc.</p> <h1><span>Subscription Forwarding as an Local Optimisation</span></h1> <p>Conceptually, I find it useful to think of forwarding subscriptions to publishers as of an optimisation:</p> <p>In the beginning you have the distribution tree, the messages flow from the publisher to each subscriber. The ultimate subscribers may choose to drop some of the messages. That's how it's implemented in nanomsg at the moment.</p> <p>Then, some parts of the tree may implement local optimisation by subscriber letting the publisher know that it won't need some messages.</p> <p>Looking at the problem from this point of view is pretty useful, as it allows us to think of multicast parts of the tree not as some kind of oddity, but rather fully standard nodes which happen not to implement the optimisation.</p> <h1><span>Different Matching Algorithms</span></h1> <p>Sometimes, we don't want to match using a prefix of the message, as it is done in both ØMQ and nanomsg. Instead, we would like to match using regular expressions, SQL-like queries of whatever other matching mechanisms you can think of. This functionality was part of Crossroads (now defunct fork of ØMQ) and should be taken into the consideration when standardising the SP PUB/SUB protocol.</p> <p>Note that treating the subscription forwarding as an optional optimisation allows the system to ignore subscriptions of unknown types and treat them as &quot;send me everything&quot; subscriptions.</p> <h1><span>Issues</span></h1> <p>There's one fundamental problem with subscription forwarding:</p> <p>PUB/SUB allows for multiple publisher/single consumer topologies.</p> <p>What it means is that the subsription are sent to multiple destinations. At the same time, the mechanism for passing subscriptions has to be fully reliable (losing a subscription is not an option as it would result in faulty business logic).</p> <p>And as I've tried to explain in many places reliable transmission to multiple destinations is possible only if you accept the fact that a slow/stuck/hung-up destination can cause the whole distribution tree to slow down or grind to the halt.</p> <p>Obviously, we cannot accept that in PUB/SUB protocol.</p> <p>There were many solutions for the problem discussed, all of them badly lacking in one or an other way:</p> <ol> <li>Don't allow more that one publisher per subscriber.</li> <li>If there's more than one publisher, turn the subscription forwarding off.</li> <li>Allow for forwarding subscriptions only one hop upstream, thus limiting the possible damage to a single tier of subscribers.</li> <li>One solutions that is typically mentioned is &quot;kill the connection to the slow/hung-up publisher&quot;. The problem with that is the connection will be re-connected in a short time and then all the subscriptions will be resent, making the congestion situation even worse (you can think of it as a connection-based equivalent of retransmit storm).</li> </ol> <p>One solution that looks like it may work is to limit the number of subscriptions from a single consumer to a small number (say 32) and subscribe to the full message feed if any further subscription is issued.</p> <h1><span>Conclusion</span></h1> <p>This article proposes a PUB/SUB system with minimal bandwidth requirements as it is currently implemented in ØMQ/3.0.</p> <p>The long-term goal is to reconcile this system with systems used by different messaging implementations and standardise it as an SP protocol.</p> <p><strong>Martin Sústrik, May 4th, 2014</strong></p> <div style="height:2em"></div> </div> <p>by <span class="printuser avatarhover"><a href="http://www.wikidot.com/user:info/martin-sustrik" ><img class="small" src="http://www.wikidot.com/avatar.php?userid=939&amp;amp;size=small&amp;amp;timestamp=1438465175" alt="martin_sustrik" style="background-image:url(http://www.wikidot.com/userkarma.php?u=939)" /></a><a href="http://www.wikidot.com/user:info/martin-sustrik" >martin_sustrik</a></span></p> 
				 	]]>
				</content:encoded>							</item>
					<item>
				<guid>http://250bpm.com/blog:38</guid>
				<title>Code Generation &amp; Visual Smog in Code (part II)</title>
				<link>http://250bpm.com/blog:38</link>
				<description>

&lt;p&gt;[[div style=&amp;quot;width:50em&amp;quot;]]&lt;/p&gt;
&lt;p&gt;by &lt;span class=&quot;printuser avatarhover&quot;&gt;&lt;a href=&quot;http://www.wikidot.com/user:info/martin-sustrik&quot;  &gt;&lt;img class=&quot;small&quot; src=&quot;http://www.wikidot.com/avatar.php?userid=939&amp;amp;amp;size=small&amp;amp;amp;timestamp=1438465175&quot; alt=&quot;martin_sustrik&quot; style=&quot;background-image:url(http://www.wikidot.com/userkarma.php?u=939)&quot; /&gt;&lt;/a&gt;&lt;a href=&quot;http://www.wikidot.com/user:info/martin-sustrik&quot;  &gt;martin_sustrik&lt;/a&gt;&lt;/span&gt;&lt;/p&gt;
</description>
				<pubDate>Sun, 27 Apr 2014 11:13:12 +0000</pubDate>
												<content:encoded>
					<![CDATA[
						 <div style="width:50em"> <p>(continued from the previous episode&#8230;)</p> <h3><span>14.</span></h3> <p>Now that we've dealt with repetitive code and escape sequences, here comes the actual meat of the problem: The indentation.</p> <p>Indentation is a way to employ the 2D shape recognition capability built into the human brain in the service of understanding the code. If the indentation is broken, our innate 2D scanner switches off and we are left with having to understand the structure of the code the hard way, the way the compilers do. This off-switch of programmer's 2D scanner is used, for example, by the obfuscation technique that removes all the layout and present the whole program as a single line of code or as an ASCII art.</p> <p>As far as code generation is concerned, there are two distinct problems to deal with: First, generator's source code has to be indented, second, the generated code has to be indented. As we'll see later on the two are interconnected in a subtle manner. However, for now, let's just note that the first problem is much more important to solve, as the generated code is mostly meant to be consumed by a machine and thus doesn't have to be particularly pretty.</p> <h3><span>15.</span></h3> <p>Indenting of the code generator's source code is a hard problem and one that cannot be solved by technical means.</p> <p>I am aware that when one asserts that something cannot be done he's just asking to be exposed as an idiot by someone who actually does it. However, let's consider my argument first:</p> <p>The code generator and the generated program are two very distinct algorithms. After all, the first just concatenates some strings while the latter controls, say, a shoe factory.</p> <p>That being a fact, their internal logical structure is going to be very different. Thus, if we want to use indentation to expose the logical structure of both, we are going to end up with two unrelated interleaved indentations:</p> <div class="code"> <pre> <code>.foo = bar; if(a == 0) .++foo; if(b == 0) .if (foo == 3) . bar = 0; if(c == 0) .++bar; end .foo = 0; end end</code> </pre></div> <h3><span>16.</span></h3> <p>For all I can say, the problem is inherent to interleaving two distict programs and there's no way to twist it in such a way as to end up with a single indentation and yet preserve the ability to 2D-scan both programs.</p> <p>So, let's accept double indentation as a fact. Is there a way to preserve 2D-scanability under such circumstances?</p> <p>Think about it for a while! What we need is to see two different things in a single chunk of code&#8230;</p> <p>And by now those with some psychology background should recall they have encountered something similar in the past.</p> <p>And yes, you are right. It's the <a href="https://en.wikipedia.org/wiki/Necker%27s_cube">Necker cube</a>!</p> <div class="image-container aligncenter"><img src="http://250bpm.wdfiles.com/local--files/blog:38/necker.png" alt="necker.png" class="image" /></div> <p>Necker cube is a single drawing that can be interpreted in two different ways:</p> <div class="image-container aligncenter"><img src="http://250bpm.wdfiles.com/local--files/blog:38/necker2.png" alt="necker2.png" class="image" /></div> <p>From our point of view the most important property of Necker cube is that altough the observer can't see both cubes at the same time, he can switch between the two representations at will.</p> <p>What we need to do is to achieve the same flip-flop effect with the doubly-indented code.</p> <p>Switching between the two cubes happens by imagining that some lines are invisible (the dotted lines on the picture above). Sadly, the same trick doesn't really work with the ribosome source. You can try to imagine that the lines startng with a dot in the code sample above are invisible and focus only on the non-dotted lines. I expect it will give you hard time and the results won't be spectacular.</p> <p>We can help our imagination though by using our colour preception capability. Here's the syntax-highlighted version of the code. Focus on the black code first, then try to switch to the blue code:</p> <div class="image-container aligncenter"><img src="http://250bpm.wdfiles.com/local--files/blog:38/vim1.png" alt="vim1.png" class="image" /></div> <p>Have you been able to achieve the Necker-cube-like flip-flop effect? Have you seen black well-indented code first and blue well-indented code later?</p> <p>Maybe not. It's not that easy. However &#8212; and I've tested it myself &#8212; if you spend an hour writing the code highlighted like that you'll get much better at switching between the two representations.</p> <h3><span>17.</span></h3> <p>The syntax highlighting can be thought of as a computer-aided Necker cube effect. But why limit ourselves to plain old syntax highlighting? Surely, computer must be capable of doing more to help us.</p> <p>When you put aside options that are technically unfeasible at the moment, like using our innate 3C recognition capability (control language on the foreground, generated language on the background), we can resort to yet more syntax highlighting.</p> <p>Specifically, the computer can simulate the flip-flop effect for us.</p> <p>If you install vim plugin for ribosome, by default you will be see the DNA files in black/blue colouring, as shown above.</p> <p>However, by pressing F3, the control language will get full Ruby syntax colouring while the generated language will be shaded, so as not to distract from the perception of the control language:</p> <div class="image-container aligncenter"><img src="http://250bpm.wdfiles.com/local--files/blog:38/vim2.png" alt="vim2.png" class="image" /></div> <p>By pressing F4 you'll get exactly the opposite effect. Control language will be shaded and the generated language will be properly syntax-highlighted (you'll have to let vim know what language you are generating though):</p> <div class="image-container aligncenter"><img src="http://250bpm.wdfiles.com/local--files/blog:38/vim4.png" alt="vim4.png" class="image" /></div> <p>By alternatively pressing F3 and F4 you are in fact generating the full-blown computer-aided Necker cude effect.</p> <h3><span>18.</span></h3> <p>We'll return to the indentation of ribosome source later, but now let's have a look at the indentation of the generated code.</p> <p>As already said, it's not that important, because it's something that will be processed by a machine anyways. There are two caveats though:</p> <ol> <li>What if we are generating Python program? If so, identation must be done correctly, otherwise the program won't work.</li> <li>What if the output is meant to be human-readable anyway? If so, indentation is a must. Non-indented program in a non-readable program.</li> </ol> <h3><span>19.</span></h3> <p>To understand the problem we are dealing with here, consider the following program:</p> <div class="code"> <pre> <code>def greet(name) .printf (&quot;Hello, &quot;); .printf (&quot;@{name}!&quot;); .printf (&quot;\n&quot;); end .@{greet(&quot;Alice&quot;)} .if (is_bob_present) { . @{greet(&quot;Bob&quot;)} .}</code> </pre></div> <p>The output of such program &#8212; if done naively &#8212; would look like this:</p> <div class="code"> <pre> <code>printf (&quot;Hello, &quot;); printf (&quot;Alice!&quot;); printf (&quot;\n&quot;); if (is_bob_present) { printf (&quot;Hello, &quot;); printf (&quot;Bob!&quot;); printf (&quot;\n&quot;); }</code> </pre></div> <p>Note the mis-indentation in the lines 6 and 7! And with large non-toy programs, the problem will become ubiquitous &#8212; there rarely will be a line that's well indented.</p> <h3><span>20.</span></h3> <p>If you run the above code with ribosome, the result will be correctly indented though. So what's going on here?</p> <p>Before diving into technical details, recall how you deal with the same problem when you are editing code by hand.</p> <p>While most of the time you spend editing individual lines of code, when the indentation is involved you often select a whole block of code and move it to the left or to the right as needed. Individual editors provide different means to make this operation easy to perform. For example, in most WYSIWYG editors selecting a block of code and pressing Tab moves the block to the right.</p> <p>Can the code generator do a similar thing for us?</p> <p>And the answer is, obviously, yes. The only thing we have to do is to stop thinking in terms of lines of code and rather think about rectangular blocks of code.</p> <h3><span>21.</span></h3> <p>Technically, all string literals are trated as simple one-line block of code, while output of any embedded expression is treated as a rectangular block of code, whether is consists one line or mulitple lines.</p> <p>The the following source code:</p> <div class="code"> <pre> <code>. @{greet(&quot;Bob&quot;)}</code> </pre></div> <p>Can be graphically represented like this:</p> <div class="image-container aligncenter"><img src="http://250bpm.wdfiles.com/local--files/blog:38/ribo9.png" alt="ribo9.png" class="image" /></div> <p>There's a simple literal block of 4 space characters on the left and a 3 line long block generated by function <em>greet</em> on the right.</p> <h3><span>22.</span></h3> <p>Note how this system works in recursive manner. You can use it to properly indent pieces of code within your program, but your program as a whole can be treated as a single block of code and indented as needed:</p> <div class="code"> <pre> <code>def greet(name) .printf (&quot;Hello, &quot;); .printf (&quot;@{name}!&quot;); .printf (&quot;\n&quot;); end def greetall() .@{greet(&quot;Alice&quot;)} .if (is_bob_present) { . @{greet(&quot;Bob&quot;)} .} end .#include &lt;stdio.h&gt; . .int main() { . @{greetall} . return 0; .}</code> </pre></div> <p>And here's the output, showing blocks of code as they were processed on different levels of the call stack:</p> <div class="image-container aligncenter"><img src="http://250bpm.wdfiles.com/local--files/blog:38/ribo10.png" alt="ribo10.png" class="image" /></div> <h3><span>23.</span></h3> <p>All the above can be boiled down to a single leading principle: It's not the callee's task to manage the layout of the block being generated. Rather, it's caller's responsibility to place the block to the appropriate position.</p> <p>What does that mean for the whitespace in the generated block? If every line of the block starts with four spaces, should we carefully preserve them, or should we rather say that it's not the callee's job to care about indentation and ignore the whitespace as random noise?</p> <p>In ribosome I've opted for the latter choice:</p> <div class="image-container aligncenter"><img src="http://250bpm.wdfiles.com/local--files/blog:38/ribo11.png" alt="ribo11.png" class="image" /></div> <p>Note how the whitespace on the left is not part of the generated block!</p> <h3><span>24.</span></h3> <p>While this choice looks somehow arbitrary, it has not entirely consequence for formating of the DNA source file:</p> <p>Given that any whitespace on the left will be ignored, we can add arbitrary whitespace in each function to re-synchronise divergent indenting between the control language and the generated language:</p> <div class="code"> <pre> <code>if(a == 0) .i += 1; if(b == 0) .i += 2; if(c == 0) .i += 3; if(d == 0) .i += 4; end end end end</code> </pre></div> <p>In lines 7 and 8, the control language is mis-aligned with the gererated language by 12 characters. Now consider what happens if we split the code into two functions:</p> <div class="code"> <pre> <code>def foo() . i += 2; if(c == 0) . i += 3; if(d == 0) . i += 4; end end end if(a == 0) .i += 1; if(b == 0) .@{foo} end end</code> </pre></div> <p>As can be seen, the divergence doesn't exceed 4 characters here.</p> <h3><span>25.</span></h3> <p>Finally, few words for the biologically savvy.</p> <p>Of course, besides being an vague allusion to Ruby, ribosome's name refers to the molecular machine within the living cell, one that translates RNA into proteins.</p> <p>And of course, ribosomes process RNA, not DNA.</p> <p>So, if we want to stretch the biological metaphor further, ribosome tool plays role of both RNA polymerase and the ribosome itself. In fact, the tool works in two steps (invisible to the user):</p> <ol> <li>First step compiles .dna file into a code generator written in pure ruby (it has .rna extension!)</li> <li>Second step runs the code generator to produce the actual generated code.</li> </ol> <p><strong>Martin Sústrik, April 27th, 2014</strong></p> <div style="height:2em"></div> </div> <p>by <span class="printuser avatarhover"><a href="http://www.wikidot.com/user:info/martin-sustrik" ><img class="small" src="http://www.wikidot.com/avatar.php?userid=939&amp;amp;size=small&amp;amp;timestamp=1438465175" alt="martin_sustrik" style="background-image:url(http://www.wikidot.com/userkarma.php?u=939)" /></a><a href="http://www.wikidot.com/user:info/martin-sustrik" >martin_sustrik</a></span></p> 
				 	]]>
				</content:encoded>							</item>
					<item>
				<guid>http://250bpm.com/blog:37</guid>
				<title>Code Generation &amp; Visual Smog in Code (part I)</title>
				<link>http://250bpm.com/blog:37</link>
				<description>

&lt;p&gt;[[div style=&amp;quot;width:50em&amp;quot;]]&lt;/p&gt;
&lt;p&gt;by &lt;span class=&quot;printuser avatarhover&quot;&gt;&lt;a href=&quot;http://www.wikidot.com/user:info/martin-sustrik&quot;  &gt;&lt;img class=&quot;small&quot; src=&quot;http://www.wikidot.com/avatar.php?userid=939&amp;amp;amp;size=small&amp;amp;amp;timestamp=1438465175&quot; alt=&quot;martin_sustrik&quot; style=&quot;background-image:url(http://www.wikidot.com/userkarma.php?u=939)&quot; /&gt;&lt;/a&gt;&lt;a href=&quot;http://www.wikidot.com/user:info/martin-sustrik&quot;  &gt;martin_sustrik&lt;/a&gt;&lt;/span&gt;&lt;/p&gt;
</description>
				<pubDate>Mon, 21 Apr 2014 12:12:24 +0000</pubDate>
												<content:encoded>
					<![CDATA[
						 <div style="width:50em"> <h3><span>1.</span></h3> <p>I am reading that code generation &#8212; then called &quot;automatic programming&quot; &#8212; originally referred to the automation of the task of punching the paper tape. Later on it became to mean generation of the real code from a higher level language like, say, Fortran.</p> <p>Today it's not longer about creating full-blown programming languages. It's still about compilers, but rather about compilers for ad hoc, domain-specific and thow-away languages.</p> <h3><span>2.</span></h3> <p>Technically, to implement a compiler one needs two major parts: First, a general-purpose language to write the compiler in. Second, a parser.</p> <p>Traditionally, compilers are implemented using C as an implementation language and tools like <a href="https://en.wikipedia.org/wiki/Yacc">yacc</a> to do the parsing part. However, if you need to write a throw-away compiler in two minutes, none of those fares particularly well. Both C and yacc are hard, time-consuming and error-prone to use.</p> <p>Instead, you want a powerful, easy to use scripting language with an extensive set of libraries, something like Perl, Python or Ruby.</p> <p>As for the parsing, you don't want to create a new language from scratch. Two minutes is not enough time to do that. Instead, you want a well-known shrinkwrapped language that allows for creating arbitrarily complex structures, while making no assumptions about the semantics. In short, we want JSON, YAML or XML.</p> <h3><span>3.</span></h3> <p>Given that both the implementation languages and the parsers and widely-available, free and easy to use, writing a code generator is a technically trivial task.</p> <p>So, is there any need for a specialised tool for code generation in the end? Why not simply parse an XML file from Python?</p> <p>And the answer is: Because readability.</p> <p>Source code of code generator is notoriously ugly and hard to comprehend. And the code that cannot be read, cannot be maintained, fixed or shared with others. Any code generation tool should thus put the main focus on readability.</p> <h3><span>4.</span></h3> <p>Me being a C programmer, let's have a look at C example:</p> <div class="code"> <pre> <code>printf (&quot;Hello, world!\n&quot;);</code> </pre></div> <p>Now let's write code that will generate the code above:</p> <div class="code"> <pre> <code>printf (&quot;printf (\&quot;Hello, world!\\n\&quot;);&quot;);</code> </pre></div> <p>And code that will generate the above generator:</p> <div class="code"> <pre> <code>printf (&quot;printf (\&quot;printf (\\\&quot;Hello, world!\\\\n\\\&quot;);\&quot;);&quot;);</code> </pre></div> <p>Yuck!</p> <h3><span>5.</span></h3> <p>Let me introduce a new term here: By &quot;visual smog&quot; I mean graphical elements in the code that are needed from syntactical point of view, but which don't contribute in any way to expressing the programmer's intent.</p> <div class="code"> <pre> <code>printf (&quot;printf (\&quot;printf (\\\&quot;Hello, world!\\\\n\\\&quot;);\&quot;);&quot;);</code> </pre></div> <p>The intent of the above code is &quot;<em>generate a program that will generate a program that will print a string to stdout</em>&quot;. However, around 50% of the code is not helpful in coveying the message and rather gets into the way. The amount of visual smog in the example is extremely high.</p> <h3><span>6.</span></h3> <p>As far as I can say, visual smog in code generators emanates from three major sources:</p> <ol> <li>Repetitive code</li> <li>Escape sequences</li> <li>Inconsistent indentation</li> </ol> <h3><span>7.</span></h3> <p>Code like the one below contains a lot od visual smog:</p> <div class="code"> <pre> <code>printf (&quot;#include &lt;stdio.h&gt;\n&quot;); printf (&quot;\n&quot;); printf (&quot;int main () {\n&quot;); printf (&quot; printf (&quot;Hello, world!\\n&quot;);\n&quot;); printf (&quot; return 0;\n&quot;); printf (&quot;}\n&quot;);</code> </pre></div> <p>Most of it is created by the repetitive printf statement. In past couple of weeks I've played with code generation and, in the process, I've written a simple code generation tool called <a href="https://github.com/sustrik/ribosome">ribosome</a>. There, I've decided to replace the leading print statement by a single typographically lightweight and easy-to-type character:</p> <div class="code"> <pre> <code>.#include &lt;stdio.h&gt; . .int main () { . printf (&quot;Hello, world!&quot;); . return 0; .}</code> </pre></div> <h3><span>8.</span></h3> <p>The control language in the case of ribosome is JavaScript and Ruby rather than C. I've discarded Perl for it's cryptic syntax &#8212; we are trying to improve the readability here, mind you! &#8212; and Python because its indenting rules clash with the requirements for a code generation tool. In the end, the mix of control language and generated language looks something like this:</p> <div class="code"> <pre> <code>for i in 1..5 .printf (&quot;Hello, world!\n&quot;); end</code> </pre></div> <p>Which, of course, generates the following code:</p> <div class="code"> <pre> <code>printf (&quot;Hello, world!\n&quot;); printf (&quot;Hello, world!\n&quot;); printf (&quot;Hello, world!\n&quot;); printf (&quot;Hello, world!\n&quot;); printf (&quot;Hello, world!\n&quot;);</code> </pre></div> <p>To understand my comment about Python above, consider that the dot (.) must be the first character in the line. If it was possible to put the dot operator anywhere on the line, we would have to escape any other dot characters in the generated code:</p> <div class="code"> <pre> <code> .System\.out\.println (&quot;Hello, world!\n&quot;);</code> </pre></div> <br /> Assuming that that's not an option, let's imagine the hello world example above with Python as a control language: <div class="code"> <pre> <code>for i in range(1,5): .printf (&quot;Hello, world!\n&quot;);</code> </pre></div> <p>Given the Python's syntax, now we have to way to express whether the dot statement should be a part of the loop or whether it should be executed once after the loops ends.</p> <p>By the way, <a href="http://www.makotemplates.org/">Mako</a> code generator uses Python as the control language, but has to add extra syntactic costructs to the language:</p> <div class="code"> <pre> <code>% for name in row: &lt;td&gt;${name}&lt;/td&gt;\ % endfor</code> </pre></div> <h3><span>9.</span></h3> <p>The next interesting question is: Why do we prepend the lines of output by dot rather than the lines of Ruby code? Why not the other way round? Or, for what it's worth, why not use two different prefix characters to distinguish between the two?</p> <p>There are two answers. One of them is kind of trivial, the other one is more philosphical:</p> <p>First: We know that dots at the beginning of the line are quite rare in Ruby. Even if dot happens to occur at the beginning of the line it should be easy to get rid of it by simply adding some whitespace. Thus, there's no much space for clashes, where a Ruby line would be accidentally mistaken for an output line. However, this reasoning doesn't work the other way round: We don't know anything about the language being generated. It may be C or Fortran or Lisp. It may be something completely different. We can't guarantee that there won't be dots at the beginnings of the lines.</p> <p>In the worst case we would be generating a ribosome source file (meta-code-generation) which would mean a lot of dots at the beginnings of the lines. We would have to introduce escape sequences. And the least thing anyone wants in a code generator is more escape sequences.</p> <p>Second reason: Most code generation tools use the paradigm of the &quot;template&quot;. What it means is that you consider the source file to be a template that contains the destination code in almost the final form and the code generation tool is used only to replace few selected snippets in the file by generated values:</p> <div class="code"> <pre> <code>#include &lt;stdio.h&gt; int main() { printf (&quot;Hello, $(name)!\n&quot;); }</code> </pre></div> <p>The template paradigm may work well for a simple function like <em>printf</em> (and maybe it works OK for generating web pages), but once complex looping, branching and function invocation is added to the mix, the template doesn't look like a template anymore. It looks like what it really is, namely, a program that generates another program.</p> <p>Thus, I believe, the user approaches the code generation tool with the intent of writing a code-generating code, not a template. And in ribosome I've done my best to accommodate that mindset.</p> <p>What it means in practice is that user can start writing Ruby code straight away with no extra syntactic baggage. They can even copy existing Ruby code to a ribosome source file and expect it to work out of the box!</p> <p>And that, of course, means that lines of Ruby cannot have any extra prefixes.</p> <h3><span>10.</span></h3> <p>As already mentioned, escape sequences are another ample source of visual smog. The only way to fully avoid escape sequnces is to use prefixes &#8212; which is exactly what ribosome does with the dot notation. The dot has special meaning (&quot;pass the line to the output&quot;) only if it occurs as the first charcter in the line. The rest of the line can thus contain arbitrary characters, including dot, without a need for escape sequences:</p> <div class="code"> <pre> <code>.System.out.println (&quot;Hello, world!\n&quot;);</code> </pre></div> <h3><span>11.</span></h3> <p>Ribosome could have done with no escape sequences altogether, were it not for the need to embed ruby expressions into the generated code:</p> <div class="code"> <pre> <code>name = &quot;Alice&quot; .printf (&quot;Hello, @{name}!\n&quot;);</code> </pre></div> <p>What I've tried to do with @{} notation (as well as with the three other remaining operators) was:</p> <ol> <li>Use two characters for the operator, not a single one.</li> <li>Choose an unlikely combination of characters, rather than a common one.</li> </ol> <p>The very fact of using two characters as an operator lowers the chance of collision with the generated text quadratically. Using uncommon two-character sequences lowers the probabilty of collision to almost zero.</p> <h3><span>12.</span></h3> <p>There's one exception to the above reasoning: When generating ribosome source file by ribosome (meta-level) the character sequences representing ribosome operators won't be all that rare. Therefore, I've created a dedicated construct to deal with such meta-meta-meta issues.</p> <p>Specifically, I've introduced a &quot;nested expression of N-th level&quot; operator. The idea is that the compilation of an expression of N-th level yields an expression of (N-1)-th level, until, in the end, compilation of 1st level nested expression yields the computed value of the expression:</p> <div class="code"> <pre> <code>@3{x} =&gt; @2{x} =&gt; @{x} =&gt; x</code> </pre></div> <p>For those who enjoy meta-programing, here's an practical example:</p> <div class="code"> <pre> <code>.name = &quot;Alice&quot; ..Hello, @2{name}!</code> </pre></div> <p>First compilation results in:</p> <div class="code"> <pre> <code>name = &quot;Alice&quot; .Hello, @{name}!</code> </pre></div> <p>And the second one yields:</p> <div class="code"> <pre> <code>Hello, Alice!</code> </pre></div> <h3><span>13.</span></h3> <p>In the rare case where we need to generate @{ sequence without the result actually being an ribosome input file we must finally resort to a proper escape sequence. These are implemented as simple Ruby functions:</p> <div class="code"> <pre> <code>.In the rare case where we need to generate @{at}{ sequence without...</code> </pre></div> <p><strong>To be continued&#8230;</strong></p> <p><strong>Martin Sústrik, April 21st, 2014</strong></p> <div style="height:2em"></div> </div> <p>by <span class="printuser avatarhover"><a href="http://www.wikidot.com/user:info/martin-sustrik" ><img class="small" src="http://www.wikidot.com/avatar.php?userid=939&amp;amp;size=small&amp;amp;timestamp=1438465175" alt="martin_sustrik" style="background-image:url(http://www.wikidot.com/userkarma.php?u=939)" /></a><a href="http://www.wikidot.com/user:info/martin-sustrik" >martin_sustrik</a></span></p> 
				 	]]>
				</content:encoded>							</item>
				</channel>
</rss>