<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Aris&#039;s Blog</title>
	<atom:link href="http://blog.0xbadc0de.be/feed" rel="self" type="application/rss+xml" />
	<link>http://blog.0xbadc0de.be</link>
	<description>Computers, ssh and rock&#039;n roll</description>
	<lastBuildDate>Mon, 07 Jun 2010 21:35:21 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.1</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Threading design pattern?</title>
		<link>http://blog.0xbadc0de.be/archives/46</link>
		<comments>http://blog.0xbadc0de.be/archives/46#comments</comments>
		<pubDate>Mon, 07 Jun 2010 21:33:21 +0000</pubDate>
		<dc:creator>Aris Adamantiadis</dc:creator>
				<category><![CDATA[libssh]]></category>

		<guid isPermaLink="false">http://blog.0xbadc0de.be/?p=46</guid>
		<description><![CDATA[When designing the new libssh architecture, we decided to make it completely callback-based, even internally. This provides cool features, such as the possibility to extend libssh without recompiling it, or to run asynchronous or nonblocking functions more easily. Libssh 0.5 will run as a state machine, which listens for packets, and then calls callbacks from the packet [...]]]></description>
			<content:encoded><![CDATA[<p>When designing the new <a href="http://www.libssh.org">libssh</a> architecture, we decided to make it completely callback-based, even internally. This provides cool features, such as the possibility to extend libssh without recompiling it, or to run asynchronous or nonblocking functions more easily. Libssh 0.5 will run as a state machine which listens for packets and then calls the callback matching each packet type. The handlers evaluate the current state of the session and decide what to do with each packet. Sometimes the state of the session itself changes as the result of a packet (e.g. when receiving an SSH_AUTH_SUCCESS packet).</p>
<p>The sequence diagram of a synchronous function, such as authentication or a simple channel read, can be sketched as follows:</p>
<p><a href="http://blog.0xbadc0de.be/wp-content/uploads/2010/06/monothread.png"><img class="alignnone size-full wp-image-47" title="monothread" src="http://blog.0xbadc0de.be/wp-content/uploads/2010/06/monothread.png" alt="" width="603" height="530" /></a></p>
<p>What&#8217;s happening is pretty straightforward. Thread X is waiting for a specific packet x (or more precisely, for the internal state change caused by packet x). It calls a function named ProcessInput (this is a simplification), which takes a lock and tries to read a packet from the socket. After a packet has been read, the callback for that packet (in this case, x) is called, and it updates the internal state of the session.</p>
<p>ProcessInput returns after reading one packet. X then checks whether the state has changed to x. If not, it simply calls ProcessInput again (not shown on the drawing) until it sees a state change it can process.</p>
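<p>The loop described above can be sketched in C roughly as follows. This is only an illustration under stated assumptions: the struct, the enum and the function names are hypothetical and are not libssh&#8217;s real API, and the socket is replaced by an in-memory string where each character stands for one packet type.</p>

```c
#include <stddef.h>

/* Hypothetical, simplified session: NOT the real libssh structures. */
enum state { STATE_INIT, STATE_X, STATE_Y };

struct session {
    enum state state;        /* internal state updated by packet callbacks */
    const char *incoming;    /* simulated socket: one char per packet type */
    size_t pos;              /* next packet to read from the simulated socket */
};

/* Packet callback: updates the session state according to the packet type. */
static void packet_callback(struct session *s, char type)
{
    if (type == 'x')
        s->state = STATE_X;
    else if (type == 'y')
        s->state = STATE_Y;
}

/* Reads exactly one packet and dispatches it. Returns -1 if none is left. */
static int process_input(struct session *s)
{
    char type = s->incoming[s->pos];

    if (type == '\0')
        return -1;
    s->pos++;
    packet_callback(s, type);
    return 0;
}

/* What thread X does: call process_input() until the wanted state shows up. */
static int wait_for_state(struct session *s, enum state wanted)
{
    while (s->state != wanted) {
        if (process_input(s) < 0)
            return -1;
    }
    return 0;
}
```

<p>With a simulated input of &#8220;yx&#8221;, a caller waiting for STATE_X goes through process_input() twice: the first call handles the y packet, the second one handles the x packet and ends the loop.</p>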
<h1>So, what&#8217;s the problem?</h1>
<p>I thought that this design could provide an interesting feature to libssh users. By adding a lock in the ProcessInput function (already shown on the previous drawing), we could let applications call different libssh functions on the same session, simultaneously, from different threads. Thread X could be doing a blocking ssh_channel_read() on one channel while thread Y does the same on another. A naive implementation of locking would give this result:<a href="http://blog.0xbadc0de.be/wp-content/uploads/2010/06/twothreads_simplelock.png"><img class="alignnone size-full wp-image-50" title="twothreads_simplelock" src="http://blog.0xbadc0de.be/wp-content/uploads/2010/06/twothreads_simplelock.png" alt="" width="821" height="990" /></a></p>
<p>This sequence diagram is a simple extension of the previous one. Thread X waits for a packet x, and thread Y waits for its turn on a lock (or semaphore) in ProcessInput(Y). It looks great, except there&#8217;s a catch. This example does not work if the first packet to show up is not an x packet:</p>
<p><a href="http://blog.0xbadc0de.be/wp-content/uploads/2010/06/problematic-scenario.png"><img class="alignnone size-full wp-image-51" title="problematic scenario" src="http://blog.0xbadc0de.be/wp-content/uploads/2010/06/problematic-scenario.png" alt="" width="860" height="889" /></a></p>
<p>In this example, the x packet never arrives. The ProcessInput called by thread X receives the y packet and processes it (after all, any thread may update the internal state of the session). The problem is what happens next: once ProcessInput has processed the y packet (leaving X unhappy and looping in the hope of receiving the x packet), the lock that ProcessInput(Y) was waiting on is released, and Y goes on to read a packet, which can block and make thread Y wait for a while. This is unfortunate, because the session was already in the state y that Y wanted before Y called ReadPacket. ProcessInput is meant to be generic, so it knows nothing about the x or y states.</p>
<p>I&#8217;m looking for a design pattern or an elegant solution to this problem, ideally from people who have already solved it before me.</p>
<h1>Potential solutions?</h1>
<p>I have thought of three solutions:</p>
<ul>
<li>The lock would &#8220;remember&#8221; that it was blocked by another thread: it would wait until the lock is free and then return directly to the caller. This way, thread Y would not run the potentially blocking ReadPacket function when thread X has already done the hard work for it. In the opposite case (our second example), Y would call ProcessInput a second time and catch the y packet sooner or later.<br />
Unfortunately, I do not see a clean way of doing this with the common-denominator building blocks of pthread and win32 threads. It doesn&#8217;t look like an elegant solution to me, but it does comply with the specification &#8220;ProcessInput returns when at least one packet has been read since ProcessInput was called&#8221;.</li>
<li>A received-packet counter would be read at the start of the ProcessInput function and stored on the local stack. The counter would be incremented each time a packet is received on the session, and after entering the critical section of ProcessInput, the two values would be compared. If the counter has changed, ProcessInput would return.<br />
I suspect this scenario is vulnerable to a race between the moment the function is called and the moment the counter is read. Nothing guarantees that another thread did not finish reading the y packet just before we initialized the local copy of the counter.</li>
<li>The ProcessInput function would take an additional parameter that helps it decide whether ReadPacket is still worth calling. This could be a callback invoked just after acquiring the lock. For instance, Y could call ProcessInput with a specific callback function check_y() which checks whether the state has changed as a result of the y packet. Y could also call this function itself in its ProcessInput loop, since it is essentially the termination condition of that loop.<br />
As a drawback, I think this solution couples the ProcessInput function more tightly to its potential callers (there are hundreds of them in libssh master) and may add too much complexity.</li>
</ul>
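<p>To make the third proposal more concrete, here is a minimal sketch using a pthread mutex. Everything in it is an assumption made for illustration: the session structure, session_init(), read_packet() and the check_x()/check_y() predicates are hypothetical names, not libssh code, and the socket is again simulated by a string of packet types. A win32 build would need its own lock primitive.</p>

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical, simplified session: NOT the real libssh structures. */
struct session {
    pthread_mutex_t lock;    /* serializes access to the socket and state */
    int got_x, got_y;        /* state changes caused by x and y packets */
    const char *incoming;    /* simulated socket: one char per packet type */
    size_t pos;              /* next packet to read */
    int reads;               /* how many times the (blocking) read ran */
};

/* Predicate supplied by the caller: "is the state I wait for reached?" */
typedef int (*done_fn)(const struct session *);

static int check_x(const struct session *s) { return s->got_x; }
static int check_y(const struct session *s) { return s->got_y; }

static void session_init(struct session *s, const char *incoming)
{
    pthread_mutex_init(&s->lock, NULL);
    s->got_x = s->got_y = 0;
    s->incoming = incoming;
    s->pos = 0;
    s->reads = 0;
}

/* Stand-in for the potentially blocking ReadPacket. */
static int read_packet(struct session *s, char *type)
{
    s->reads++;
    if (s->incoming[s->pos] == '\0')
        return -1;
    *type = s->incoming[s->pos++];
    return 0;
}

/*
 * After taking the lock, ask the caller's predicate whether another thread
 * already produced the awaited state. Only read a packet if it did not.
 */
static int process_input(struct session *s, done_fn done)
{
    char type;
    int rc = 0;

    pthread_mutex_lock(&s->lock);
    if (!done(s)) {
        rc = read_packet(s, &type);
        if (rc == 0) {
            if (type == 'x')
                s->got_x = 1;
            if (type == 'y')
                s->got_y = 1;
        }
    }
    pthread_mutex_unlock(&s->lock);
    return rc;
}
```

<p>In the problematic scenario, X calls process_input(&amp;s, check_x) and ends up consuming the y packet. When Y then gets the lock through process_input(&amp;s, check_y), the predicate is already true, so Y returns without ever calling the blocking read. The price is the one mentioned in the list: every caller has to pass its own predicate.</p>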
<p>What&#8217;s your opinion? Feel free to comment!</p>
<p style="text-align: right;">Aris</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.0xbadc0de.be/archives/46/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Debugging a cryptographic bug in libssh&#8230;</title>
		<link>http://blog.0xbadc0de.be/archives/15</link>
		<comments>http://blog.0xbadc0de.be/archives/15#comments</comments>
		<pubDate>Sat, 17 Apr 2010 18:07:36 +0000</pubDate>
		<dc:creator>Aris Adamantiadis</dc:creator>
				<category><![CDATA[libssh]]></category>

		<guid isPermaLink="false">http://blog.0xbadc0de.be/?p=15</guid>
		<description><![CDATA[Hey there, you may know I am a developer of the SSH library libssh. Last week, a post on the libssh mailing list reported a connection problem under RHEL 4.8. It seemed that the new cipher aes128-ctr, recently implemented in libssh, had a little problem&#8230;
This bug looked strange, firstly because we never ever [...]]]></description>
			<content:encoded><![CDATA[<p>Hey there, you may know I am a developer of the <a href="http://www.libssh.org">SSH library</a> libssh. Last week, a <a href="http://www.libssh.org/archive/libssh/2010-04/0000022.html">post</a> on the libssh mailing list reported a connection problem under RHEL 4.8. It seemed that the new cipher aes128-ctr, recently implemented in libssh, had a little problem&#8230;</p>
<p>This bug looked strange, firstly because we had never had any cryptographic problem within libssh, and secondly because the debug output did not show anything obviously broken:<br />
<code><br />
[3] Set output algorithm to aes256-ctr<br />
[3] Set input algorithm to aes256-ctr<br />
[3] Writing on the wire a packet having 17 bytes before<br />
[3] 17 bytes after comp + 10 padding bytes = 28 bytes packet<br />
[3] Encrypting packet with seq num: 3, len: 32<br />
[3] Sent SSH_MSG_SERVICE_REQUEST (service ssh-userauth)<br />
[3] Decrypting 16 bytes<br />
[3] Packet size decrypted: 44 (0x2c)<br />
[3] Read a 44 bytes packet<br />
[3] Decrypting 32 bytes<br />
2010-04-12 13:14:54,211557; 1126189408 procSrvAuth;  Did not receive SERVICE_ACCEPT</code></p>
<p>Meanwhile, the OpenSSH side logged:<br />
<code><br />
sshd[22341]: debug1: SSH2_MSG_NEWKEYS sent<br />
sshd[22341]: debug1: expecting SSH2_MSG_NEWKEYS<br />
sshd[22341]: debug1: SSH2_MSG_NEWKEYS received<br />
sshd[22341]: debug1: KEX done<br />
sshd[22341]: Disconnecting: Corrupted MAC on input.<br />
</code></p>
<h2>What does this mean?</h2>
<p>libssh was sending garbage and was receiving some kind of garbage too (a variation of the last error showed an HMAC error). However, the &#8220;size&#8221; field of the SSH packet (the first 32 bits of every packet) was consistent with the type of packet being received. So what was going on?<br />
Further analysis of the received plaintext on both OpenSSH and libssh showed that the first 16 bytes of the first packet in each direction were correct. So this bug affected the whole stream except the first block of blocksize bytes. The code in libssh that performs aes128-ctr is the following:<br />
<code><br />
static void aes_ctr128_encrypt(struct crypto_struct *cipher, void *in, void *out,<br />
unsigned long len, void *IV) {<br />
unsigned char tmp_buffer[128/8];<br />
unsigned int num=0;<br />
/* Some things are special with ctr128 :<br />
* In this case, tmp_buffer is not being used, because it is used to store temporary data<br />
* when an encryption is made on lengths that are not multiple of blocksize.<br />
* Same for num, which is being used to store the current offset in blocksize in CTR<br />
* function.<br />
*/<br />
AES_ctr128_encrypt(in, out, len, cipher-&gt;key, IV, tmp_buffer, &amp;num);<br />
}</code></p>
<h2>So, how does AES-CTR work?</h2>
<p><img class="alignnone" title="CTR encryption" src="http://upload.wikimedia.org/wikipedia/commons/3/3f/Ctr_encryption.png" alt="" width="601" height="242" /></p>
<p>CTR is a stream cipher mode built on top of a block cipher. In SSH, it&#8217;s used like a block cipher anyway. It has two interesting characteristics:</p>
<ul>
<li>The same code is used for encryption and decryption, because it produces an OTP-like (one-time pad) stream of bytes that is XORed with the data</li>
<li>The key is used for the block cipher encryption, and the input to the block cipher is a nonce combined with a counter</li>
</ul>
<p>That&#8217;s where things begin to get interesting. In our code, IV is used as the nonce and is derived from the cryptographic parameters during the key exchange. I verified that its initial value was consistent with the valid (working) implementation. tmp_buffer is a buffer used for the internal operations of the cipher; it&#8217;s normally not important. The num variable reports how far we are in the encryption of the current block, in order to emulate a stream cipher. We are not using this feature (SSH always encrypts packets whose length is a multiple of the blocksize), so the returned value is always 0.</p>
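<p>As a rough illustration of the mode itself, here is a small C sketch of CTR. It rests on loud assumptions: toy_block_encrypt() is a made-up placeholder that is neither AES nor secure, and all names are hypothetical. It only shows the mechanics: encrypt the counter, XOR the result with the data, increment the counter, and use the very same routine to decrypt.</p>

```c
#include <stddef.h>
#include <stdint.h>

#define BLOCK 16

/* Toy stand-in for the block cipher. It is NOT AES and NOT secure. */
static void toy_block_encrypt(const uint8_t key[BLOCK], const uint8_t in[BLOCK],
                              uint8_t out[BLOCK])
{
    for (size_t i = 0; i < BLOCK; i++)
        out[i] = (uint8_t)((in[i] ^ key[i]) * 167u + 13u);
}

/* Increment the whole 128-bit big-endian counter by one. */
static void ctr_inc(uint8_t counter[BLOCK])
{
    for (int i = BLOCK - 1; i >= 0; i--)
        if (++counter[i] != 0)
            break;
}

/*
 * CTR mode: encrypt the counter to get a keystream block, XOR it with the
 * data, then increment the counter. len must be a multiple of BLOCK here,
 * as is the case for SSH packets. The same call encrypts and decrypts.
 */
static void ctr_crypt(const uint8_t key[BLOCK], uint8_t counter[BLOCK],
                      const uint8_t *in, uint8_t *out, size_t len)
{
    uint8_t keystream[BLOCK];

    for (size_t off = 0; off < len; off += BLOCK) {
        toy_block_encrypt(key, counter, keystream);
        for (size_t i = 0; i < BLOCK; i++)
            out[off + i] = in[off + i] ^ keystream[i];
        ctr_inc(counter);
    }
}
```

<p>Calling ctr_crypt() twice with the same key and the same starting counter gives back the original data. Both sides must therefore agree not only on the key and the initial counter, but also on how the counter is incremented.</p>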
<p>So how come libssh with OpenSSL 0.9.8 on my desktop works like a charm against OpenSSH on RHEL 4.8, while libssh with OpenSSL 0.9.7a on RHEL/CentOS 4.8 does not?</p>
<p>I had to go one step further and look at what could be wrong in the way I was using the AES_ctr128_encrypt function. I looked at the code of OpenSSL 0.9.8 and found this:<br />
<code><br />
/* increment counter (128-bit int) by 1 */<br />
static void AES_ctr128_inc(unsigned char *counter) {<br />
unsigned long c;<br />
<br />
/* Grab bottom dword of counter and increment */<br />
c = GETU32(counter + 12);<br />
c++;    c &amp;= 0xFFFFFFFF;<br />
[...]</code></p>
<p>This is the code used to increment the counter. And now the surprise, in the sources of OpenSSL 0.9.7a:<br />
<code><br />
/* increment counter (128-bit int) by 2^64 */<br />
static void AES_ctr128_inc(unsigned char *counter) {<br />
unsigned long c;<br />
<br />
/* Grab 3rd dword of counter and increment */<br />
#ifdef L_ENDIAN<br />
c = GETU32(counter + 8);<br />
[...]</code></p>
<p>What does that mean? It means that the counter is not incremented the same way in the two versions of AES-CTR128! OpenSSL has shipped both a broken and a correct implementation of AES-CTR128. The <a href="http://osdir.com/ml/encryption.openssl.cvs/2003-07/msg00001.html">CVS commit</a> fixing this dates back to 2003, and I found that OpenSSL 0.9.7c includes the fix. This also explains why the first block was correct: it is encrypted with the initial counter value, before any increment takes place. Of course, no documentation explains that difference, and nothing in the header files lets you know whether you&#8217;re facing a broken or a working implementation (I would have expected a #define in the working version).</p>
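<p>The effect of two different increment conventions is easy to reproduce in isolation. The sketch below is a simplified illustration, not a copy of either OpenSSL routine: inc_low() adds one to the whole 128-bit big-endian counter, while the hypothetical inc_word_at_8() bumps the 32-bit word stored at offset 8 instead.</p>

```c
#include <stdint.h>

#define BLOCK 16

/* Increment the whole 128-bit big-endian counter by one (carry propagates). */
static void inc_low(uint8_t counter[BLOCK])
{
    for (int i = BLOCK - 1; i >= 0; i--)
        if (++counter[i] != 0)
            break;
}

/*
 * Illustrative "other" convention: bump the 32-bit big-endian word stored at
 * offset 8 instead. This is a simplification for the demo, not a copy of the
 * OpenSSL 0.9.7a routine.
 */
static void inc_word_at_8(uint8_t counter[BLOCK])
{
    for (int i = 11; i >= 8; i--)
        if (++counter[i] != 0)
            break;
}

/* Returns 1 if both counters hold the same 16 bytes, 0 otherwise. */
static int same_counter(const uint8_t a[BLOCK], const uint8_t b[BLOCK])
{
    for (int i = 0; i < BLOCK; i++)
        if (a[i] != b[i])
            return 0;
    return 1;
}
```

<p>Two peers starting from the same IV hold identical counters for the first block, so its keystream matches on both sides. After a single increment the counters already differ, and every following block is decrypted with the wrong keystream, which matches the symptom of a correct first 16 bytes followed by garbage.</p>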
<p>By studying the sources of OpenSSH, I found that it was not affected by the bug because it implements CTR mode on its own. Not wanting to do the same, I simply disabled the compilation of the CTR algorithms in libssh when building against a broken OpenSSL. Yep, &#8220;Fixed!&#8221;.</p>
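<p>One possible way to express such a check is to compare OpenSSL&#8217;s version number against the first release known to be fixed. The helper below is a hypothetical sketch and not the actual libssh change: aes_ctr_usable() is a made-up name, and the threshold is simply 0.9.7c, the version mentioned above, written in the OPENSSL_VERSION_NUMBER encoding.</p>

```c
/*
 * Hypothetical helper: returns 1 when a version number in the
 * OPENSSL_VERSION_NUMBER format (0xMNNFFPPS: major, minor, fix, patch,
 * status) is at least 0.9.7c. 0x0090703f encodes the 0.9.7c release; the
 * threshold comes from the observation in this post, not from any OpenSSL
 * documentation.
 */
static int aes_ctr_usable(unsigned long openssl_version)
{
    return openssl_version >= 0x0090703fUL;
}
```

<p>In a real build, the value would come from the OPENSSL_VERSION_NUMBER macro, and the same comparison could be done in the preprocessor so that the CTR ciphers are not even compiled on older versions.</p>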
<h2>Lessons learned</h2>
<p>These things are important when you&#8217;re debugging cryptographic code that produces garbage:</p>
<ul>
<li>Verify the input. Garbage in, garbage out.</li>
<li>Verify derived inputs such as the IV. Even a single wrong bit can turn the output into garbage.</li>
<li>Verify the output. It&#8217;s possible that the output of the cryptographic algorithm you&#8217;re using is good, and it&#8217;s just not what the other party is reading and trying to decrypt&#8230;</li>
<li>Verify that you&#8217;re using the crypto correctly. When the only documentation you have is the header file, as with libcrypto, read the source.</li>
<li>If none of this worked&#8230; read the source of the crypto and find the difference between the working implementation and the broken one. Maybe it&#8217;s something you did not understand and used wrongly, maybe&#8230;</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://blog.0xbadc0de.be/archives/15/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>My new blog</title>
		<link>http://blog.0xbadc0de.be/archives/4</link>
		<comments>http://blog.0xbadc0de.be/archives/4#comments</comments>
		<pubDate>Mon, 08 Feb 2010 16:47:25 +0000</pubDate>
		<dc:creator>Aris Adamantiadis</dc:creator>
				<category><![CDATA[libssh]]></category>
		<category><![CDATA[mylife]]></category>
		<category><![CDATA[work]]></category>

		<guid isPermaLink="false">http://blog.0xbadc0de.be/?p=4</guid>
		<description><![CDATA[Hi there!
After some solicitation from third parties (read: libssh developers), I finally installed a real blog to more or less replace my wiki-based website. I&#8217;m going to discuss libssh development and share thoughts about programming, networks, computer security and the internet.
I&#8217;ll take the opportunity to talk about the FOSDEM convention that took place this [...]]]></description>
			<content:encoded><![CDATA[<p>Hi there!</p>
<p>After some solicitation from third parties (read: libssh developers), I finally installed a real blog to more or less replace my wiki-based website. I&#8217;m going to discuss <a href="http://www.libssh.org">libssh</a> development and share thoughts about programming, networks, computer security and the internet.</p>
<p>I&#8217;ll take the opportunity to talk about the <a href="http://www.fosdem.org">FOSDEM</a> convention that took place this weekend. It is an awesome meeting of open source developers, and it was a great occasion to meet people, in particular <a href="http://blog.cynapses.org/">Andreas</a>, who is a libssh developer as well. It&#8217;s also interesting to note that this is the first time FOSDEM was connected to the IP world through <a href="http://www.belnet.be">BELNET</a> (note: I work for BELNET). FOSDEM was connected to BELNET over IPv4 and IPv6 using fiber, and this bandwidth was distributed to users through WiFi access points spread across the ULB campus. The <a href="http://stdio.be/onsite.fosdem.net/">bandwidth</a> used peaked around 100 Mbit/s (a poor 1/10th of the available bandwidth) due to the technical limitations of the airwaves, but I&#8217;m sure the tech staff will find a solution for next year. Also interesting to note: an IPv4 address was available for each participant (two /19 were allocated), and the ratio of IPv6-enabled to IPv4-enabled clients was around 90%.</p>
<p>I&#8217;m looking forward to more good stuff next year!</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.0xbadc0de.be/archives/4/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
