How Google Authenticator Works

Most people use Google Authenticator to generate two-factor authentication (2FA) tokens on their phone, with Authy as a recent alternative. What’s cool is that any service can make use of these apps as long as it is compatible. But what does it mean to be compatible? How do these apps work?


Apps like Google Authenticator implement the Time-Based One-Time Password (TOTP) algorithm. It has the following ingredients:

  • A shared secret (a sequence of bytes)
  • An input derived from the current time
  • A signing function

Shared Secret

The shared secret is what you need to obtain to set up the account on your phone. Either you take a photo of a QR code using your phone, or you enter the secret manually. Because not all byte values are displayable characters, the secret is base32-encoded (why not base64?).

For manual entry, Google’s services present this secret in the following format:

xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx

This value is 32 base32 characters (160 bits) but can be smaller for other services. The QR code contains this same secret as a URL:

otpauth://totp/Google%3Ayourname@gmail.com?secret=xxxx&issuer=Google
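
Any standard URL parser can pick that apart. A quick Python sketch, using a made-up secret value:

from urllib.parse import parse_qs, urlparse

url = "otpauth://totp/Google%3Ayourname@gmail.com?secret=JBSWY3DPEHPK3PXP&issuer=Google"
params = parse_qs(urlparse(url).query)
print(params["secret"][0])  # JBSWY3DPEHPK3PXP
print(params["issuer"][0])  # Google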

Input (Current Time)

The input time value you simply get from your phone; no further interaction with the server is required once you have obtained the secret. However, it is important that your phone’s time is accurate, as the server will essentially repeat what happens on your phone using the current time as known by the server.

More specifically, the server will compare the submitted token to all tokens generated within a window of time (e.g. a couple of minutes), to account for the time it takes you to type the token and send it to the server.

Signing Function

The signing function used is HMAC-SHA1. HMAC stands for hash-based message authentication code, an algorithm that uses a secure one-way hash function (SHA1 in this case) to sign a value. Using an HMAC allows us to verify authenticity – only parties that know the secret can generate the same output for the same input (the current time). This all sounds complex, but the algorithm is very simple (details omitted):

hmac = SHA1(secret + SHA1(secret + input))
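
You would normally not implement this yourself. As a sketch, Python’s standard library exposes the full construction (which, unlike the simplified formula above, also XORs the key with fixed inner and outer padding blocks); the secret and message here are placeholder values:

import hashlib
import hmac

secret = b"shared secret"    # placeholder values for illustration
message = b"current time"

# hmac.new performs the key padding and nested hashing internally.
print(hmac.new(secret, message, hashlib.sha1).hexdigest())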

As an aside, TOTP is in fact an extension of HOTP, the HMAC-Based One-Time Password algorithm – they are the same thing, except that TOTP specifies that the current time is used as the input value, while HOTP simply uses an incrementing counter that needs to be kept synchronized.

Algorithm

First we’ll need to base32-decode the secret. Google presents it with spaces and in lowercase to make it easier for humans to grok, but base32 actually does not allow spaces and only allows uppercase letters. Thus:

original_secret = xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx
secret = BASE32_DECODE(TO_UPPERCASE(REMOVE_SPACES(original_secret)))
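
As a runnable Python sketch of the same normalization (the secret here is a made-up placeholder):

import base64

original_secret = "abcd efgh ijkl mnop abcd efgh ijkl mnop"
secret = base64.b32decode(original_secret.replace(" ", "").upper())
print(len(secret))  # 20 bytes for a 32-character secret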

Next we derive the input from the current time. For this we’ll use UNIX time, the number of seconds since the epoch:

input = CURRENT_UNIX_TIME()

One thing you have probably noticed in Google Authenticator is that codes are valid for some time before changing to the next value. If the value changed every second it would be rather difficult to copy, after all. This validity period defaults to 30 seconds, so we can simply integer-divide by 30 to get a value that remains stable within a 30-second window. We don’t really care whether the value has a particular scale, as long as it is reproducible on both sides.

input = CURRENT_UNIX_TIME() / 30
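
The same derivation in Python, for reference (time.time() returns seconds since the epoch):

import time

time_step = int(time.time()) // 30  # the "input" above; stable for 30 seconds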

Finally we apply the signing function, HMAC-SHA1:

original_secret = xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx
secret = BASE32_DECODE(TO_UPPERCASE(REMOVE_SPACES(original_secret)))
input = CURRENT_UNIX_TIME() / 30
hmac = SHA1(secret + SHA1(secret + input))

Now, we could be done here, as what we have so far provides effective 2FA. However, the resulting HMAC value is a standard-length SHA1 value (20 bytes, 40 hex characters) and nobody wants to type 40 characters. We want those pretty 6-digit numbers!

To convert the 20-byte SHA1 to a 6-digit number we’ll first slim it down a bit. We use the last 4 bits of the SHA1 (a value ranging from 0 to 15) as an offset into the 20-byte value, and take the 4 bytes starting at that offset. The highest possible offset is 15, so the slice ends at 15 + 4 = 19 at most, which never overruns the 20-byte value (remember, zero-based indexing). So anyway, we get those 4 bytes:

offset = LAST_4_BITS(hmac)
four_bytes = hmac[offset:offset + 4]

We can now turn these into a standard 32 bit unsigned integer (4 bytes = 32 bit).

large_integer = INT(four_bytes)

Now we have a number, much better! However, as the name suggests, this could still be a very large value (up to 2^32 − 1), and that would obviously not be a 6-digit number. We can guarantee at most 6 digits by taking the remainder of division by the first 7-digit number: one million. (Results with fewer digits are simply left-padded with zeros for display.)

large_integer = INT(four_bytes)
small_integer = large_integer % 1,000,000
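
Here are the truncation, integer conversion and modulo steps as runnable Python, applied to a dummy 20-byte value rather than a real HMAC output:

import struct

digest = bytes(range(20))                # stand-in for the 20-byte HMAC-SHA1

offset = digest[-1] & 0x0F               # low 4 bits of the last byte: 0-15
four_bytes = digest[offset:offset + 4]
large_integer = struct.unpack(">I", four_bytes)[0]  # big-endian unsigned int
small_integer = large_integer % 1_000_000
print(small_integer)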

This is our final value. Here’s everything together:

original_secret = xxxx xxxx xxxx xxxx xxxx xxxx xxxx xxxx
secret = BASE32_DECODE(TO_UPPERCASE(REMOVE_SPACES(original_secret)))
input = CURRENT_UNIX_TIME() / 30
hmac = SHA1(secret + SHA1(secret + input))
offset = LAST_4_BITS(hmac)
four_bytes = hmac[offset:offset + 4]
large_integer = INT(four_bytes)
small_integer = large_integer % 1,000,000
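
And here is the whole pipeline as a runnable Python sketch. Two details the pseudocode above glosses over: standard TOTP packs the time step as an 8-byte big-endian counter before signing, and masks the top bit of the truncated integer (both per RFC 4226/6238). The secret below is a made-up placeholder:

import base64
import hashlib
import hmac
import struct
import time

def totp(original_secret, step=30, digits=6):
    secret = base64.b32decode(original_secret.replace(" ", "").upper())
    counter = struct.pack(">Q", int(time.time()) // step)   # 8-byte big-endian input
    digest = hmac.new(secret, counter, hashlib.sha1).digest()
    offset = digest[-1] & 0x0F                               # dynamic truncation
    large_integer = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(large_integer % 10 ** digits).zfill(digits)   # keep leading zeros

print(totp("abcd efgh ijkl mnop abcd efgh ijkl mnop"))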

 

 


Reach Robotics closes $7.5M Series A for its augmented reality bots

After years of research and development, Reach Robotics has closed a $7.5 million Series A, co-led by Korea Investment Partners (KiP) and IGlobe, to bring its augmented reality bots to market in a big way. The Bristol-based startup is looking to expand into the U.S., and the team is exploring opportunities for growth into other European and Asian markets.

Reach Robotics’ first product, MekaMon, launched last fall. Today’s round comes after the company produced and sold an initial run of 500 of its four-legged, crab-like bots. MekaMon fits into an emerging category of smartphone-enabled augmented reality toys like those made by Anki.

Silas Adekunle, CEO of Reach Robotics, tells me the influx of capital will be used to make some strategic hires and increase brand recognition through marketing. This is the first time the startup has announced a funding round. Adekunle tells me his experience raising capital wasn’t easy; as they say, hardware is hard.

“It was hard to pitch in our early days because people didn’t believe,” explained Adekunle.

MekaMon sits somewhere between toy and full-fledged robot. Unlike the radio-controlled RadioShack robots of yesteryear, MekaMon costs a hefty $329. At first glance this can be hard to swallow, but Adekunle remains adamant that he is building a platform and not a line of toys — think PS4 instead of an expensive, single-use robot collecting dust on a shelf.

Outside of retail sales, another avenue for the company to make money is through partnerships within the entertainment industry. Adekunle says that Reach would never go out of its way to deliver a specific product for a client, but he always keeps an eye out for overlap where a partnership could occur with minimal operational changes.

“People are taken aback that something could be this realistic,” asserts Adekunle. “If you strip back the product and lose that, then you don’t have an innovative company.”

Because Reach is selling software-enabled hardware, it has the opportunity to collect all sorts of interesting data that it can use to fine-tune its products. The startup is able to track retention in aggregate and look at how people actually use their robots. Moreover, if MekaMon suffers leg failure, Reach can analyze indicators like temperature readings and torque.

Adekunle insists on keeping the Reach Robotics team interdisciplinary — one employee helped shape the way robots move in the Transformers movie series. This same team is focused on empowering the next group of developers who will build on the MekaMon platform and create new use cases, beyond the company’s initial vision for the product.

Marvel is bringing its superheroes to VR with a new Oculus-exclusive game

The Incredible Hulk and some of his other box office money-grabbing super pals will be coming to the world of virtual reality.

Marvel Powers United VR, announced at Disney’s D23 event on Saturday, will allow players a chance to step into the shoes of some familiar heroes as they destroy lots of stuff in VR.

Powers United VR, an Oculus-exclusive, looks pretty similar to existing VR wave shooters like Robo Recall, though its multi-player could spice things up a bit. The main highlight will obviously be having IP from Marvel; players will be able to choose from 12 Marvel characters as they exact righteous mayhem.

The title is being developed by Sanzaru Games, which has already done a couple of VR titles for the Rift, including VR Sports Challenge and Ripcoil.

Facebook and Oculus have devoted $500 million to funding made-for-VR content. Oculus has been doing so largely with the hopes of attracting exclusives and interest from top AAA game publishers who have been reticent to invest significant cash into a space with so few users relative to console and PC audiences.

With Marvel, Oculus has found a partnership that allows it another big-name exclusive to show off its highest-end Rift and Touch controller hardware, which it has heavily discounted in recent months as Facebook looks to sell units and keep up with competition in the niche VR space.

Building a hefty library of exclusives is even more important to the company following E3, where Oculus was largely overlooked as the highly influential ZeniMax-owned Bethesda announced a number of titles from blockbuster series, including DOOM, Fallout and The Elder Scrolls, that it will be porting to competing virtual reality systems like HTC’s Vive and Sony’s PlayStation VR. This comes as Facebook fights an injunction from the Oculus/ZeniMax lawsuit, for which it has already been ordered to pay up a half-billion dollars.

Marvel Powers United VR is slated for a 2018 release.

How RSA public key encryption works

RSA is an algorithm used by modern computers to encrypt and decrypt messages. It is an asymmetric cryptographic algorithm. Asymmetric means that there are two different keys. This is also called public key cryptography, because one of them can be given to everyone. The other key must be kept private. It is based on the fact that finding the factors of an integer is hard (the factoring problem). RSA stands for Ron Rivest, Adi Shamir and Leonard Adleman, who first publicly described it in 1978. A user of RSA creates and then publishes the product of two large prime numbers, along with an auxiliary value, as their public key. The prime factors must be kept secret. Anyone can use the public key to encrypt a message, but with currently published methods, if the public key is large enough, only someone with knowledge of the prime factors can feasibly decode the message.

Operation

RSA involves a public key and a private key. The public key can be known to everyone; it is used to encrypt messages. Messages encrypted using the public key can only be decrypted with the private key. The keys for the RSA algorithm are generated the following way:

  1. Choose two different large random prime numbers p and q.
  2. Calculate n = pq.
    • n is the modulus for both the public and private keys.
  3. Calculate the totient: φ(n) = (p − 1)(q − 1).
  4. Choose an integer e such that 1 < e < φ(n), and e is coprime to φ(n), i.e. e and φ(n) share no factors other than 1: gcd(e, φ(n)) = 1.
    • e is released as the public key exponent.
  5. Compute d to satisfy the congruence relation d·e ≡ 1 (mod φ(n)), i.e. d·e = 1 + k·φ(n) for some integer k.
    • d is kept as the private key exponent.

Notes on the above steps:

  • Step 1: numbers can be probabilistically tested for primality.
  • Step 3: changed in PKCS#1 v2.0 to λ(n) = lcm(p − 1, q − 1) instead of φ(n) = (p − 1)(q − 1).
  • Step 4: a popular choice for the public exponent is e = 2^16 + 1 = 65537. Some applications choose smaller values such as e = 3, 5, or 35 instead. This is done to make encryption and signature verification faster on small devices like smart cards, but small public exponents may lead to greater security risks.
  • Steps 4 and 5 can be performed with the extended Euclidean algorithm; see modular arithmetic.

The public key is made of the modulus n and the public (or encryption) exponent e.
The private key is made of the modulus n and the private (or decryption) exponent d, which must be kept secret.

  • For efficiency a different form of the private key can be stored:
    • p and q: the primes from the key generation,
    • d mod (p − 1) and d mod (q − 1): often called dmp1 and dmq1,
    • q^(−1) mod p: often called iqmp.
  • All parts of the private key must be kept secret in this form. p and q are sensitive since they are the factors of n, and they allow computation of d given e. If p and q are not stored in this form of the private key then they are securely deleted along with the other intermediate values from key generation.
  • Although this form allows faster decryption and signing by using the Chinese Remainder Theorem (CRT), it is considerably less secure since it enables side-channel attacks. This is a particular problem if implemented on smart cards, which benefit most from the improved efficiency. (Start with y = x^e (mod n) and let the card decrypt that: it computes y^d (mod p) or y^d (mod q), whose results give some value z. Now induce an error in one of the computations. Then gcd(z − x, n) will reveal p or q.)

Encrypting messages

Alice gives her public key (n, e) to Bob and keeps her private key secret. Bob wants to send message M to Alice.

First he turns M into a number m smaller than n by using an agreed-upon reversible protocol known as a padding scheme. He then computes the ciphertext c corresponding to:

  c = m^e mod n

This can be done quickly using the method of exponentiation by squaring. Bob then sends c to Alice.

Decrypting messages

Alice can recover m from c by using her private key d in the following procedure:

  m = c^d mod n

Given m, she can recover the original message M.

The decryption procedure works because first

  c^d ≡ (m^e)^d ≡ m^(ed) (mod n).

Now, since

  ed ≡ 1 (mod p − 1) and
  ed ≡ 1 (mod q − 1)

Fermat’s little theorem yields

  m^(ed) ≡ m (mod p) and
  m^(ed) ≡ m (mod q).

Since p and q are distinct prime numbers, applying the Chinese remainder theorem to these two congruences yields

  m^(ed) ≡ m (mod pq).

Thus,

  c^d ≡ m (mod n).

A working example

Here is an example of RSA encryption and decryption. The parameters used here are artificially small, but you can also use OpenSSL to generate and examine a real keypair.

  1. Choose two random prime numbers: p = 61 and q = 53.
  2. Compute n = pq: n = 61 × 53 = 3233.
  3. Compute the totient φ(n) = (p − 1)(q − 1): φ(n) = (61 − 1)(53 − 1) = 3120.
  4. Choose e > 1 coprime to 3120: e = 17.
  5. Choose d to satisfy d·e ≡ 1 (mod φ(n)): d = 2753, since 17 × 2753 = 46801 = 1 + 15 × 3120.
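
These numbers are easy to verify in Python; a minimal sketch of the toy key generation above (pow(e, -1, phi) computes the modular inverse and requires Python 3.8+):

p, q = 61, 53
n = p * q                  # 3233
phi = (p - 1) * (q - 1)    # 3120
e = 17
d = pow(e, -1, phi)        # modular inverse of e mod phi: 2753 (Python 3.8+)
assert (d * e) % phi == 1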

The public key is (n = 3233, e = 17). For a padded message m the encryption function is:

  c = m^e mod n = m^17 mod 3233.

The private key is (n = 3233, d = 2753). The decryption function is:

  m = c^d mod n = c^2753 mod 3233.

For example, to encrypt m = 123, we calculate

  c = 123^17 mod 3233 = 855.

To decrypt c = 855, we calculate

  m = 855^2753 mod 3233 = 123.

Both of these calculations can be computed efficiently using the square-and-multiply algorithm for modular exponentiation.
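
Python’s built-in three-argument pow does exactly this modular exponentiation, so the example can be checked directly:

n, e, d = 3233, 17, 2753
c = pow(123, e, n)   # encrypt: 855
m = pow(c, d, n)     # decrypt: 123
assert (c, m) == (855, 123)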

Padding schemes

When used in practice, RSA must be combined with some form of padding scheme, so that no values of M result in insecure ciphertexts. RSA used without padding may have some problems:

  • The values m = 0 or m = 1 always produce ciphertexts equal to 0 or 1 respectively, due to the properties of exponentiation.
  • When encrypting with small encryption exponents (e.g., e = 3) and small values of m, the (non-modular) result of m^e may be strictly less than the modulus n. In this case, ciphertexts may be easily decrypted by taking the e-th root of the ciphertext with no regard to the modulus.
  • RSA encryption is a deterministic encryption algorithm. It has no random component. Therefore, an attacker can successfully launch a chosen plaintext attack against the cryptosystem. They can make a dictionary by encrypting likely plaintexts under the public key, and storing the resulting ciphertexts. The attacker can then observe the communication channel. As soon as they see ciphertexts that match the ones in their dictionary, the attackers can then use this dictionary in order to learn the content of the message.

In practice, the first two problems can arise when short ASCII messages are sent. In such messages, m might be the concatenation of one or more ASCII-encoded character(s). A message consisting of a single ASCII NUL character (whose numeric value is 0) would be encoded as m = 0, which produces a ciphertext of 0 no matter which values of e and N are used. Likewise, a single ASCII SOH (whose numeric value is 1) would always produce a ciphertext of 1. For systems which conventionally use small values of e, such as 3, all single-character ASCII messages encoded using this scheme would be insecure, since the largest m would have a value of 255, and 255^3 is less than any reasonable modulus. Such plaintexts could be recovered by simply taking the cube root of the ciphertext.
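
A small Python sketch of that attack; the primes here are made up for illustration, and any modulus larger than 255^3 behaves the same way for single-byte messages:

p, q, e = 5003, 7019, 3   # toy primes, chosen only for illustration
n = p * q                 # 35116057, comfortably larger than 255**3

m = ord("A")              # 65, a single ASCII character
c = pow(m, e, n)          # 274625 == 65**3: no modular reduction occurred

# The attacker recovers m with a plain cube root, ignoring n entirely.
recovered = round(c ** (1 / 3))
assert chr(recovered) == "A"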

To avoid these problems, practical RSA implementations typically embed some form of structured, randomized padding into the value m before encrypting it. This padding ensures that m does not fall into the range of insecure plaintexts, and that a given probe, once padded, will encrypt to one of a large number of different possible ciphertexts. The latter property can increase the cost of a dictionary attack beyond the capabilities of a reasonable attacker.

Standards such as PKCS have been carefully designed to securely pad messages prior to RSA encryption. Because these schemes pad the plaintext m with some number of additional bits, the size of the un-padded message M must be somewhat smaller. RSA padding schemes must be carefully designed so as to prevent sophisticated attacks. This may be made easier by a predictable message structure. Early versions of the PKCS standard used ad-hoc constructions, which were later found vulnerable to a practical adaptive chosen ciphertext attack. Modern constructions use secure techniques such as Optimal Asymmetric Encryption Padding (OAEP) to protect messages while preventing these attacks. The PKCS standard also has processing schemes designed to provide additional security for RSA signatures, e.g., the Probabilistic Signature Scheme for RSA (RSA-PSS).

Signing messages

Suppose Alice uses Bob’s public key to send him an encrypted message. In the message, she can claim to be Alice, but Bob has no way of verifying that the message was actually from Alice, since anyone can use Bob’s public key to send him encrypted messages. So, in order to verify the origin of a message, RSA can also be used to sign a message.

Suppose Alice wishes to send a signed message to Bob. She produces a hash value of the message, raises it to the power of d mod n (just like when decrypting a message), and attaches it as a “signature” to the message. When Bob receives the signed message, he raises the signature to the power of e mod n (just like encrypting a message), and compares the resulting hash value with the message’s actual hash value. If the two agree, he knows that the author of the message was in possession of Alice’s secret key, and that the message has not been tampered with since.
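
Using the toy key from the worked example, signing and verifying look like this in Python; the number standing in for the hash is arbitrary, and a real implementation would hash the message and apply a padding scheme such as RSA-PSS:

n, e, d = 3233, 17, 2753

digest = 123                      # stand-in for a real message hash (must be < n)
signature = pow(digest, d, n)     # Alice signs with her private exponent

# Bob verifies with Alice's public exponent and compares hashes.
assert pow(signature, e, n) == digest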

Note that secure padding schemes such as RSA-PSS are as essential for the security of message signing as they are for message encryption, and that the same key should never be used for both encryption and signing purposes.

Why Rails 4 Live Streaming is a big deal

TLDR: Rails Live Streaming allows Rails to compete with Node.js in the streaming arena. Streaming requires application servers to support either multi-threaded or evented I/O. Most Ruby application servers are not up for the job. Phusion Passenger Enterprise 4.0 (a Ruby app server) is to become hybrid multi-processed, multi-threaded and evented. This allows seamless support for streaming, provides excellent backwards compatibility, and allows future support for more use cases than streaming alone.

Several days ago Rails introduced Live Streaming: the ability to send partial responses to the client immediately. This is a big deal because it opens up a huge number of use cases that Rails simply wasn’t suitable for. Rails was and still is an excellent choice for “traditional” web apps where the user sends a request and expects a full response back. It was a bad choice for anything that works with response streams, for example:

  • Progress responses that continuously inform the user about the progress. Imagine a web application that performs heavy calculations that can take several minutes. Before Live Streaming, you had to split this system up into multiple pages that must respond immediately. The main page would offload the actual work into a background worker, and return a response informing the user that the work is now in progress. The user must poll a status page at a regular interval to lookup the progress of the work. With Live Streaming, you can not only simplify the code by streaming progress information in a single request, but also push progress information to the user much more quickly and without polling:
    # Controllers must mix in ActionController::Live for response.stream.
    class WorkController < ApplicationController
      include ActionController::Live

      def big_work
        work = WorkModel.new
        while !work.done?
          work.do_some_calculations
          # Each write is flushed to the client immediately.
          response.stream.write "Progress: #{work.progress}%\n"
        end
      ensure
        # Always close the stream, even if an error is raised.
        response.stream.close
      end
    end
    
  • Chat servers. Or, more generally, web apps that involve a large number of mostly idle but persistent connections. Until today this has largely been the domain of evented systems such as Node.js and Erlang.

And as Aaron Patterson has already explained, this feature is different from Rails 3.2’s template streaming.

Just “possible” is not enough

The same functionality was actually already technically possible in Ruby. According to the Rack spec, Rack app objects must return a tuple:

[status_code, headers, body]

Here, body must respond to the each method. You can implement live streaming by yourself, with raw Rack, by returning a body object that yields partial responses in its each method.

class StreamingBody
  def each
    work = WorkModel.new
    while !work.done?
      work.do_some_calculations
      yield "Progress: #{work.progress}%\n"
    end
  end
end

Notice that the syntax is nearly identical to the Rails controller example code. With this, it is possible to implement anything.

However, streaming in Ruby has never gained much traction compared to systems such as Node.js, which is much more popular for these kinds of use cases. I believe this inequality in popularity is caused by a few things:

  1. Awareness. Not everybody knew this was possible in Ruby. Indeed, it is not widely documented.
  2. Ease and support. Some realize this is possible, but choose not to use Ruby because many frameworks do not provide easy support for streaming. It was possible to stream responses in pre-4.0 Rails, but the framework code generally does not take streaming into account, so if you try to do anything fancy you run the risk of breaking things.

With Live Streaming, streaming is now easy to use as well as officially supported.

Can Rails compete with Node.js?

Node.js is gaining more and more momentum nowadays. As I see it there are several reasons for this:

  1. Love for JavaScript. Some developers prefer JavaScript over Ruby, for whatever reasons. Some like the idea of using the same language for both frontend and backend (although whether code can be easily shared between frontend and backend remains a controversial topic among developers). Others like the V8 engine for its speed. Indeed, V8 is a very well-optimized engine, much more so than Ruby 1.9’s YARV engine.
  2. Excellent support for high I/O concurrency use cases. Node.js is an evented I/O system, and evented systems can handle a massive amount of concurrent connections. All libraries in the Node.js ecosystem are designed for evented use cases, because there’s no other choice. In other languages you have to specifically look for evented libraries, so the signal-to-noise ratio is much lower.

I have to be careful here: the phrases “high I/O concurrency” and “massive amount of concurrent connections” deserve more explanation, because it’s easy to confuse them with “uber fast” or “massively scalable”. That is not what I meant. What I meant is that a single Node.js process is capable of handling a lot of client sockets, assuming that any work you perform does not saturate memory, CPU or bandwidth. In contrast, Ruby systems traditionally could only handle 1 concurrent request per process, even if you don’t do much work inside a request. We call this a multi-process I/O model because the amount of concurrent users (I/O) the system can handle scales only with the number of processes.

In traditional web apps that send back full responses, this is not a problem because the web server queues all requests, the processes respond as quickly as possible (usually saturating the CPU) and the web server buffers all responses and relieves the processes immediately. In streaming use cases, you have long-running requests so the aforementioned mechanism of letting the web server buffer responses is simply not going to work. You need more I/O concurrency: either you must have more processes, or processes must be able to handle more than 1 request simultaneously. Node.js processes can effectively handle an unlimited number of requests simultaneously, when not considering any constraints posed by CPU, memory or bandwidth.

Node.js is more than HTTP. It allows arbitrary networking with TCP and UDP. Rails is pretty much only for HTTP and even support for WebSockets is dubious, even in raw Rack. It cannot (and I believe, should not) compete with Node.js on everything, but still… Now suddenly, Rails can compete with Node.js for a large number of use cases.

Two sides of the coin

Reality is actually a bit more complicated than this. Although Rails can handle streaming responses now, not all Ruby application servers can. Ilya Grigorik described this problem in his article Rails Performance Needs an Overhaul and criticized Phusion Passenger, Mongrel and Unicorn for being purely multi-process, and thus not able to support high concurrency I/O use cases. (Side note: I believe the article’s title was poorly chosen; it criticizes I/O concurrency support, not performance.)

Mongrel’s current maintenance status appears to be in limbo. Unicorn is well-maintained, but its author Eric Wong has explicitly stated in his philosophy that Unicorn is to remain a purely multi-processed application server, with no intention to ever become multithreaded or evented. Unicorn is explicitly designed to handle fast responses only (so no streaming responses).

At the time Ilya Grigorik’s article was written, Thin was the only application server that was able to support high I/O concurrency use cases. Built on EventMachine, Thin is evented, just like Node.js. Since then, another evented application server called Goliath has appeared, also built on EventMachine. However, evented servers require evented application code, and Rails is clearly not evented.

There have been attempts to make serial-looking code evented through the use of Ruby 1.9 fibers, e.g. through the em-synchrony gem, but in my opinion fibers cause more problems than they solve. Ruby 1.8’s green threading model was essentially already like fibers: there was only one OS thread, and the Ruby green thread scheduler switched context upon encountering a blocking I/O operation. Fibers also operate within a single OS thread, but context switches only happen at explicit calls. In other words, you have to go through each and every blocking I/O operation you perform and insert fiber context-switching logic, which Ruby 1.8 already did for you. Worse, fibers give the illusion of thread safety, while in reality you can run into the same concurrency problems as with threading. But this time, you cannot easily apply locks to prevent unwanted context switching. Unless the entire ecosystem is designed around fibers, I believe evented servers + fibers remain useful only for a small number of use cases where you have tight control over the application code environment.

There is another way to support high I/O concurrency though: multi-threading, with 1 thread per connection. Multi-threaded systems generally do not support as much concurrent I/O as evented systems, but they are still quite formidable. Multi-threaded systems are limited by things such as the thread stack size, the available virtual memory address space and the quality of the kernel scheduler. But with the right tweaking they can approach the scalability of evented systems.

And so this leaves multithreaded servers as the only serious option for handling streaming support in Rails apps. It’s very easy to make Rails and most other apps work on them. Puma has recently appeared as a server in this category. Like most other Ruby application servers, you have to start Puma at least once for every web app, and each Puma instance is attached to a frontend web server in a reverse proxy setup. Because Ruby 1.9 has a Global Interpreter Lock, you should start more than one Puma process if you want to take advantage of multiple cores. Or you can use Rubinius, which does not have a Global Interpreter Lock.

And what about Phusion Passenger?

Towards a hybrid multi-processed, multi-threaded and evented application server

To recap, each I/O model – multi-process, multi-threaded, evented – has its own pros and drawbacks:

  • Multi-process
    • Pros:
      • Excellent application compatibility.
      • Lack of threading avoids concurrency bugs (e.g. race conditions) created by threading.
      • Simple and easy to understand. If one process crashes, the others are not affected.
      • Can utilize multiple cores.
    • Cons:
      • Supports very low I/O concurrency.
      • Uses a lot of memory.
  • Multi-threaded
    • Considerations:
      • Not as compatible as multi-process, although still quite good. Many libraries and frameworks support threaded environments these days. In web apps, it’s generally not too hard to make your own code thread-safe because web apps tend to be inherently embarrassingly parallel.
      • Can normally utilize multiple cores in a single process, but not in MRI Ruby. You can get around this by using JRuby or Rubinius.
    • Pros:
      • Supports high I/O concurrency.
      • Threads use less memory than processes.
    • Cons:
      • If a thread crashes, the entire process goes down.
      • Good luck debugging concurrency bugs.
  • Evented
    • Pros:
      • Extremely high I/O concurrency.
      • Uses even less memory than threads.
    • Cons:
      • Bad application compatibility. Most libraries are not designed for evented systems at all. Your application itself has to be aware of events for this to work properly.
      • If your app/libraries are evented, then you can still run into concurrency bugs like race conditions. It’s easier to avoid them in an evented system than in a threaded system, but when they do occur they are very difficult to debug.
      • Cannot utilize multiple cores in a single process.

As mentioned before, Phusion Passenger is currently a purely multi-processed application server. If we want to change its I/O model, which one should we choose? We believe the best answer is: all of them. We can give users a choice, and let them choose – on a per-application basis – which I/O model they want.

Phusion Passenger Enterprise 4.x (which we introduced earlier) is to become a hybrid multi-processed, multi-threaded and evented application server. You can choose with a single configuration option whether you want to stay with the traditional multi-processed I/O model, whether you want multiple threads in a single process, or whether you want processes to be evented. In the latter two cases, you even control how many processes you want, in order to take advantage of multiple cores and for resistance against crashes. We believe a combination of processes and threads/events is best.

Being a hybrid server with configurable I/O model allows Phusion Passenger to support more than just streaming. Suddenly, the possibilities become endless. We could for example support arbitrary TCP protocols in the future with no limits on traffic workloads.


 

Code has just landed in the Phusion Passenger Enterprise 4.0 branch to support multithreading. Note that the current Phusion Passenger Enterprise release is of the 3.0.x series and does not support this yet. As you can see in our roadmap, Phusion Passenger Enterprise 4.0 beta will follow 3.0.x very soon.

Where is Java used in Real World?

If you are a beginner and have just started learning Java, you might be wondering where exactly Java is used. You don’t see many games written in Java except Minecraft; desktop tools like Adobe Acrobat and Microsoft Office are not written in Java; neither are operating systems like Linux or Windows. So where exactly do people use Java? Does it have any real-world application or not? Well, you are not alone: many programmers ask this question before starting with Java, or after picking Java as one of their programming languages of choice at the graduate level. By the way, you can get a clue of where Java is used just by installing it on your desktop: Oracle says more than 3 billion devices run Java, and that’s a huge number, isn’t it? Most major companies use Java in one way or another. Many server-side applications are written in Java to process tens of millions of requests per day, and high-frequency trading applications are also written in Java, e.g. LMAX’s trading application, which is built on their path-breaking inter-thread communication library, Disruptor. In this article, we will see more precisely what kinds of projects are done in Java, which domains or sectors Java dominates, and where exactly Java is useful in the real world.

Real World Java Applications

There are many places where Java is used in the real world, from commercial e-commerce websites to Android apps, from scientific applications to financial applications like electronic trading systems, from games like Minecraft to desktop applications like Eclipse, NetBeans and IntelliJ, and from open-source libraries to J2ME apps. Let’s look at each of them in more detail.

1) Android Apps
If you want to see where Java is used, you are not too far away. Open your Android phone and any app: they are actually written in the Java programming language, with Google’s Android API, which is similar to the JDK. A couple of years back Android provided a much-needed boost, and today many Java programmers are Android app developers. By the way, Android uses a different JVM and different packaging, but the code is still written in Java.

2) Server Apps at Financial Services Industry
Java is very big in financial services. Lots of global investment banks like Goldman Sachs, Citigroup, Barclays, Standard Chartered and others use Java for writing front- and back-office electronic trading systems, settlement and confirmation systems, data processing projects and several others. Java is mostly used to write server-side applications, mostly without any front end, which receive data from one server (upstream), process it and send it to a downstream process. Java Swing was also popular for creating thick-client GUIs for traders, but now C# is quickly gaining market share in that space and Swing is out of breath.

3) Java Web applications
Java is also big in the e-commerce and web application space. You have a lot of RESTful services being created using Spring MVC, Struts 2.0 and similar frameworks. Even simple Servlet, JSP and Struts-based web applications are quite popular on various government projects. Many government, healthcare, insurance, education, defense and other departments have their web applications built in Java.


4) Software Tools
Many useful software and development tools are written and developed in Java, e.g. Eclipse, IntelliJ IDEA and NetBeans IDE. I think they are also the most-used desktop applications written in Java. There was a time when Swing was very popular for writing thick clients, mostly in the financial services sector and investment banks. Nowadays JavaFX is gaining popularity, but it is still not a replacement for Swing, and C# has almost replaced Swing in the finance domain.

5) Trading Application
Third-party trading applications, which are also part of the bigger financial services industry, also use Java. Popular trading applications like Murex, which is used in many banks for front-to-back connectivity, are also written in Java.

6) J2ME Apps
Though the advent of iOS and Android almost killed the J2ME market, there is still a large market of low-end Nokia and Samsung handsets that use J2ME. There was a time when almost all mobile games and applications, like those now available on Android, were written using MIDP and CLDC, part of the J2ME platform. J2ME is still popular on products like Blu-ray players, cards, set-top boxes etc. One of the reasons WhatsApp is so popular is that it is also available on J2ME for all those Nokia handsets, which is still a quite big market.

7) Embedded Space
Java is also big in the embedded space. It shows how capable the platform is: you need only 130 KB to be able to use Java technology (on a smart card or sensor). Java was originally designed for embedded devices. In fact, this is the one area that was part of Java’s initial “write once, run anywhere” campaign, and it looks like it is paying off now.

8) Big Data technologies
Hadoop and other big data technologies also use Java in one way or another, e.g. Apache’s Java-based HBase and Accumulo (open source), and Elasticsearch as well. That said, Java is not dominating this space, as there are technologies like MongoDB that are written in C++. Java has the potential to get a major share of this growing space if Hadoop or Elasticsearch goes big.

9) High Frequency Trading Space
The Java platform has improved its performance characteristics a lot, and with modern JITs it is capable of delivering performance at the C++ level. For this reason, Java is also popular for writing high-performance systems: though raw performance is a little lower than that of a native language, you gain safety, portability and maintainability in return, and it only takes one inexperienced C++ programmer to make an application slow and unreliable.

10) Scientific Applications
Nowadays Java is often a default choice for scientific applications, including natural language processing. The main reason for this is that Java is safer, more portable and more maintainable, and comes with better high-level concurrency tools than C++ or many other languages.

In the 1990s Java was quite big on the Internet thanks to applets, but over the years applets lost their popularity, mainly due to various security issues in the applet sandboxing model. Today desktop Java and applets are almost dead. Java is by default the software industry’s darling application development language, and given its heavy usage in the financial services industry, investment banks and the e-commerce web application space, anyone learning Java has a bright future ahead. Java 8 has only reinforced the belief that Java will continue to dominate the software development space for years to come.

Microsoft Bets That Bitcoin-Style Blockchains Will Be Big Business

Earlier this week a consortium of 11 giant banks including UBS and Credit Suisse announced that they had completed their first trial run of the idea of using software inspired by the digital currency Bitcoin to move assets around more efficiently. It was also a test of what Microsoft thinks could be a significant new business opportunity. The experiment, coordinated by a company called R3 CEV using Bitcoin-inspired software called Ethereum, took place inside Microsoft’s cloud computing platform, Azure.

Many large banks have said they are investigating so-called blockchain technology (see “Banks Embrace Bitcoin’s Heart but Not Its Soul”), with Santander predicting this could save the industry $20 billion annually. Microsoft wants financial companies to host their blockchain software inside Azure. It has recently struck partnerships with several startups working on blockchain software for banks and other big corporations.

“We see a huge opportunity here,” says Marley Gray, who leads Microsoft’s project and is technology strategist for financial services at Azure. “Enterprise-scale and enterprise-grade infrastructure is going to be vitally important for this financial infrastructure that will be woven using blockchain over these next few years.”

The flurry of interest in blockchains is inspired by the way the software behind Bitcoin verifies and logs transactions. Each one is recorded in a public ledger known as the blockchain, maintained by a network of computers around the world. Cryptographic software verifies transactions as they are added and ensures that the historical record can’t be altered.

Banks want their blockchains to record not bitcoins but transactions in conventional financial assets, such as currencies, bonds, or derivatives. Startups and banks are also exploring a concept known as “smart contracts,” in which updates to a blockchain can add simple computer programs—for example, to automatically make a payout when a particular transaction occurs.

Banks also want their blockchains more private than Bitcoin’s, which is public and maintained by a community of strangers. Instead, companies using a particular blockchain would each run some of the software contributing to its upkeep. Gray says that doing that inside Microsoft’s cloud servers can let banks manage and deploy blockchains more easily, making them more reliable.

“I don’t think it will be solely in Azure, but it can be a backbone,” he says. Microsoft’s blockchain as a service also makes it easy to experiment with different takes on the technology as companies try to figure out what it’s good for, says Gray.

Despite much avowed interest in the technology from financial institutions, blockchains are not yet being put to work in any meaningful way. IBM, Cisco, and Intel recently formed an open-source project that will develop open-source blockchain software, but the most developed versions of the concept come from startups still testing and refining their products.

The mismatch between banks’ ambitions and the embryonic state of blockchain deployments has led to complaints the idea is overhyped. Chris Larsen, CEO of Ripple, a company with cryptographic ledger software being tested by partners including Accenture, says Microsoft’s involvement can help assuage such fears. “Microsoft adds credibility as to where the industry is going,” he says. Since last month, Microsoft has been running one of the software “nodes” that power Ripple’s ledger technology.

Still, to get beyond just experiments — and for Microsoft’s blockchain platform to become a significant source of income — this new approach will need to become as useful and reliable as more conventional approaches to managing corporate data.

“We should be comparing ourselves with other infrastructure companies like the Oracles and SAPs of the world,” says Chris Finan, CEO of Manifold Technology, which is testing its blockchain software with partners including the Royal Bank of Canada, and which is also a partner on Microsoft’s blockchain platform. “We need to prove why this kind of infrastructure is more efficient.”