Ruby on Rails in Machine Learning – Yay or Nay?

Machine Learning is a trending field of Computer Science that takes computation to a new level and opens up a number of unique opportunities. It is getting more and more popular, and it is now common in modern web applications and services such as Netflix, Spotify, Amazon.com and Facebook. Machine Learning is a good fit for apps built around recommendations or predictions. If you want to build such an app, you will need an efficient backend technology to support it. Is Ruby on Rails the right choice?

https://ericplayground.files.wordpress.com/2017/10/9a99f-1yc4ub-q9m5kzqbjakrgfrq.png?w=918&h=688

What’s Machine Learning?

The best-known definition describes Machine Learning as the subfield of computer science that gives computers the ability to learn without being explicitly programmed.
That gives a nice clue of what it is all about. Machine Learning is a part of data science. It is used when we want computers to predict unknown results from large volumes of related data. It is a good way to handle uncertainty, for example in recommendations, predictions or the detection of particular situations. We don’t need to design and implement the decision rules by hand; the computer derives them from the data. We say that a computer gains the ability to be smart and learn new things.

How Does It Work?

We don’t need to specifically instruct a computer on what to do. Applications that need to be smart and predict uncertain values use specific structures and tools, e.g. neural nets. These technologies learn new facts and make predictions in a way that is loosely modelled on the human brain. For example, neural nets are composed of layers of units called neurons, which gives a clear analogy of how it could work.
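To make the "layers of neurons" idea concrete, here is a minimal sketch in plain Python. It is only an illustration: the weights are hand-picked, hypothetical values, and the function names are our own. A real network would learn its weights from data during training, and a library would do the arithmetic far more efficiently.

```python
import math


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron(inputs, weights, bias):
    """One neuron: a weighted sum of its inputs plus a bias, passed through an activation."""
    total = sum(i * w for i, w in zip(inputs, weights)) + bias
    return sigmoid(total)


def layer(inputs, weight_rows, biases):
    """One layer: every neuron sees the same inputs but has its own weights and bias."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]


def forward(inputs, layers):
    """Feed the inputs through each layer in turn; the last layer's output is the prediction."""
    activations = inputs
    for weight_rows, biases in layers:
        activations = layer(activations, weight_rows, biases)
    return activations


# Hypothetical, hand-picked weights: 2 inputs -> 2 hidden neurons -> 1 output neuron.
NETWORK = [
    ([[0.5, -0.6], [0.1, 0.8]], [0.0, -0.2]),
    ([[1.2, -0.7]], [0.1]),
]

prediction = forward([1.0, 0.0], NETWORK)
print(prediction)  # a single value between 0 and 1
```

Training, the part that makes this "learning", would repeatedly adjust the numbers in `NETWORK` so that the predictions move closer to known examples.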

What Do We Need?

As Machine Learning is a part of Data Science, it is built from various mathematical computations. This means that an application using the technology needs to perform complex calculations fast. Since this is not a trivial software problem, we need to choose our tools carefully.

Is Ruby on Rails a Good Choice for Machine Learning?

Ruby is an elegant programming language which has found its role in web development and scripting. With the help of Ruby and the Rails framework, developers can build MVPs in a way which is both fast and stable. This is thanks to the availability of various packages called gems, which help solve diverse problems speedily.

However, when we look for Machine Learning gems, we find that the choice is not that rich. Looking closer, the available solutions are poorly documented, do not offer efficient computation, and have gathered too small a community around them. All these factors suggest that using Ruby gems as Machine Learning solutions carries more risks than advantages, and it is not the best choice after all.
Moreover, tools and packages are only as useful as the language they are built on. Ruby is definitely one of the most interesting programming languages. It has many proven uses, but fast computing is not one of them. Ruby is not a good match for Machine Learning, and we need to look for something better.

What’s the Alternative?

Python is also a popular programming language which is often used in Data Science projects because:

  • It has numerous packages for Machine Learning and other computations. The prime examples are NumPy, pandas, Keras and TensorFlow. These packages are well documented, which helps when starting new projects and solutions. It also speeds up the process of fixing bugs.
  • Its libraries are powerful: they include many features that help with complex computations. Development with them is fast, efficient and stable. Many of them also apply a range of optimizations to speed up computation. All of these advantages make these tools mature and reliable.
  • Another important advantage of using Python libraries is considerable support from the community, as developers can easily find tutorials and tips that are valuable during development. A stable community lowers the entry threshold, so it is easier to start using new technologies from scratch.
  • Python is a developer-friendly language which is easier for Ruby on Rails developers to start with than other, lower-level programming languages. The syntax is intuitive, and it resembles Ruby's more closely than that of other popular languages.

TensorFlow

We need to choose the best Python library for our Machine Learning purposes. We recommend using TensorFlow, a popular and powerful tool from Google. It offers APIs for Python, C++ and several other programming languages. We chose TensorFlow for the benefits it provides:

  • it has excellent documentation and a bunch of helpful tutorials and how-tos, which help developers go deep into Machine Learning solutions;
  • it performs all complex calculations “behind” Python: it uses its own computational engine and leaves Python free of heavy operations;
  • it allows building neural nets and other Machine Learning structures as graphs and chains of single-operation blocks;
  • it can use a Graphics Processing Unit (GPU) for much better performance.
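The "graph of single operation blocks" idea from the list above can be sketched in a few lines of plain Python. This is not TensorFlow code and uses none of its API; the `Node` class and the helper names are invented purely for illustration. It only shows the general pattern of first describing a computation as a graph and then running it with concrete inputs.

```python
class Node:
    """One operation block in the graph; nothing is computed until run() is called."""

    def __init__(self, op, *inputs):
        self.op = op          # function that produces this node's value
        self.inputs = inputs  # upstream nodes this operation depends on

    def run(self, feed):
        # Evaluate the upstream nodes first, then apply this node's operation.
        values = [node.run(feed) for node in self.inputs]
        return self.op(feed, *values)


def placeholder(name):
    """A value supplied at run time through the feed dictionary."""
    return Node(lambda feed: feed[name])


def constant(value):
    return Node(lambda feed: value)


def add(a, b):
    return Node(lambda feed, x, y: x + y, a, b)


def multiply(a, b):
    return Node(lambda feed, x, y: x * y, a, b)


# Build the graph for y = w * x + b. At this point it is only a description.
x = placeholder("x")
w = constant(3.0)
b = constant(1.0)
y = add(multiply(w, x), b)

# Run the same graph with different inputs.
print(y.run({"x": 2.0}))   # 7.0
print(y.run({"x": -1.0}))  # -2.0
```

Separating the description of a computation from its execution is what lets an engine take over the heavy work, for example by running the operations outside the Python interpreter or on a GPU.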

RoR as a Web Application for ML

Ruby on Rails is a perfect choice for web development. It gives developers the possibility to kick off a stable MVP really fast. However, it does not guarantee the best performance or quality when it comes to complex and heavy computations.
Based on the above, it is a good idea to combine the strengths of the Ruby on Rails framework with a Python microservice that performs the Machine Learning computations. This architecture gives us a mix of the best computation efficiency and stable web application development. It minimises the time needed to build a prototype and provides the best quality of usage.

What are the main benefits of such a combination?

  • It’s easy and convenient to connect our app with other microservices. The Rails framework provides many reliable ways of communicating between different services, and doing so does not compromise the integrity of the core services.
  • Rails is great for building MVPs. Developers can build a web application fast and pitch it to investors.
  • It is a stable solution with really good documentation. Moreover, many famous companies have trusted this framework and built efficient software with it. Active community support also makes this choice smart.
  • With the help of gems (Ruby packages), developers can quickly build more complex parts of an application.

How to Connect Microservices in Python with RoR

So, we chose Ruby on Rails as the web application framework and Python with TensorFlow as the Machine Learning microservice. Great! The very last element of our technology chain is an efficient connection between these two endpoints. It is important to choose the connection technology carefully. Let’s compare the two most common options:

HTTP Communication

The first option is HTTP communication. It is definitely the most popular way of connecting two services, but the most popular option is not necessarily the best. HTTP is an old protocol, and using it between services still involves boilerplate, issues and difficulties along the way. Moreover, it does not provide the best speed in all cases. We are looking for the best efficiency at each step of development, so it is worth finding something better.
Secondly, this type of communication takes a lot of effort in both Rails and Python, which means more time is needed to build a stable solution. On the Ruby on Rails side it is quite straightforward, but we also need an endpoint in Python. If we chose HTTP, we would have to bring in an additional tool to build it on the Python side. That gives the Machine Learning service a second job and goes against the Single Responsibility Principle, which we want to respect.
Overall, HTTP is quite complex in itself. We need a really simple and efficient way.
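For comparison, here is a minimal sketch of what the Python-side HTTP endpoint could look like. It uses only the standard library's `http.server`, whereas a production service would more likely bring in a web framework. The `/predict` path, the `features` field and the `predict` function (a stand-in for a real model) are all hypothetical.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(features):
    """Hypothetical stand-in for a trained model: returns the mean of the features."""
    return sum(features) / len(features)


def handle_body(raw_body):
    """Turn a JSON request body into a (status code, JSON response body) pair."""
    try:
        features = json.loads(raw_body)["features"]
        score = predict([float(value) for value in features])
    except (ValueError, KeyError, TypeError, ZeroDivisionError):
        return 400, json.dumps({"error": 'expected {"features": [numbers]}'})
    return 200, json.dumps({"score": score})


class PredictHandler(BaseHTTPRequestHandler):
    """HTTP plumbing: read the request, delegate to handle_body, write the reply."""

    def do_POST(self):
        if self.path != "/predict":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        status, body = handle_body(self.rfile.read(length))
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port=8000):
    """Block forever, answering POST /predict requests on localhost."""
    HTTPServer(("127.0.0.1", port), PredictHandler).serve_forever()
```

Calling `serve()` starts the endpoint, and the Rails side could then POST JSON to it with any HTTP client. Note how much of the code is HTTP plumbing rather than prediction logic, which is the overhead described above.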

RabbitMQ

The second option is RabbitMQ, a message broker. It is stable and fast, and it is growing in popularity among developers. For our purposes, its communication model is simpler and faster than HTTP. Outstanding documentation and examples make it easy for developers to start using this tool. Another important advantage is the presence of truly solid Ruby and Python libraries for RabbitMQ communication, which makes it really easy and stable to use.
Using RabbitMQ is a good choice for connecting the Rails web application and the Python microservice.
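As a rough sketch of the Python side, the worker below consumes prediction requests from a queue and publishes each answer to the reply queue named by the sender, following the common request/reply pattern. It assumes the third-party `pika` client and a RabbitMQ broker on localhost. The queue name `ml.predict`, the JSON message shape and the `predict` function are hypothetical.

```python
import json

REQUEST_QUEUE = "ml.predict"  # hypothetical queue name shared with the Rails app


def predict(features):
    """Hypothetical stand-in for a trained model: returns the mean of the features."""
    return sum(features) / len(features)


def handle_message(body):
    """Decode a JSON request, run the model and encode the JSON reply."""
    request = json.loads(body)
    return json.dumps({"score": predict(request["features"])})


def run_worker(host="localhost"):
    """Consume prediction requests and reply to the queue named by each sender."""
    import pika  # third-party RabbitMQ client, installed separately (pip install pika)

    connection = pika.BlockingConnection(pika.ConnectionParameters(host=host))
    channel = connection.channel()
    channel.queue_declare(queue=REQUEST_QUEUE)

    def on_request(ch, method, properties, body):
        # Send the result back, tagged with the sender's correlation id.
        ch.basic_publish(
            exchange="",
            routing_key=properties.reply_to,
            properties=pika.BasicProperties(correlation_id=properties.correlation_id),
            body=handle_message(body),
        )
        # Acknowledge only after the reply has been published.
        ch.basic_ack(delivery_tag=method.delivery_tag)

    channel.basic_consume(queue=REQUEST_QUEUE, on_message_callback=on_request)
    channel.start_consuming()
```

Calling `run_worker()` starts the consumer. On the Rails side, a Ruby RabbitMQ client (the Bunny gem is one option) would publish a message such as `{"features": [2, 4]}` to the same queue with `reply_to` and `correlation_id` set, then wait for the answer on its reply queue.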

Wrap-up

The proposed architecture for a web application with Machine Learning features has both strong and weak sides. Web application development is stable, and Ruby gems make it possible to build the web application fast. Python and the TensorFlow library ensure efficient Machine Learning computations. Finally, the connection between the two services is fast and safe.

On the other hand, you should consider the downsides as well. The price of this almost perfect solution is a somewhat more complex architecture.

In the Machine Learning ecosystem, it is better to mix different technologies and select the best tool for each job than to rely on a single standalone choice, which is not always as good as it seems.


Pebble’s founder is back on Kickstarter with an iPhone battery case that also charges AirPods

Pebble’s founder and former CEO Eric Migicovsky is back with his first new product since the smartwatch maker got rolled up into Fitbit last year. The PodCase doesn’t have the same sort of grand ambitions as his last project, but the new case is the kind of clever one-off product Kickstarter was designed to deliver.

Up top, a pair of AirPod slots sit just to the right of the case’s camera cut out, so users can charge Apple’s Bluetooth earbuds using the same 2,500 mAh battery that keeps the phone powered. The industrial design certainly looks solid, courtesy of Pebble lead designer Steve Johns — though the Mophie comparisons are pretty much unavoidable right out of the gate.

And, of course, between the two Pebble alum and Allen Evans, a co-founder of Glyph video headset makers Avegant, the PodCase has a pretty solid foundation. After all, during its half-decade existence, Pebble claimed three of the top five Kickstarter campaigns of all time — though, admittedly, the latest didn’t ultimately end up well for anyone, save, of course, for Fitbit.

In a conversation with TechCrunch, Migicovsky is quick to point out the team’s somewhat unorthodox approach to this release. It’s a cautious take on a product launch from a team of folks who have witnessed the ups and downs of launching a hardware startup firsthand. The trio has provided the initial funding to help bring the product to Kickstarter, and then it will play things by ear from there. In fact, the group refuses to refer to itself as a proper company, instead calling Nova Technologies, “a small group of technologists” on its new Kickstarter page.

“There are some products that don’t need an entire company around them,” says Migicovsky. “We probably won’t need to scale up to meet demand, but if we do, we can scale up to meet that demand. We’re not building a company that’s selling more and more of these cases every year or having to have a hit in order to meet our revenue numbers. If it sells, great. If it doesn’t, it’s not the end of the world. We have a lot of other projects behind this, waiting in the backlog.”

It’s a fairly zen approach from founders who have been through the wringer, but PodCase is a pretty clever first step that alleviates having to carry around an additional AirPod charging case. And the fact that it keeps the headphones with the user’s iPhone when not in use should lessen the likelihood of leaving them behind.

The PodCase Kickstarter page is live, offering up the accessory in both iPhone 7 and iPhone 7 Plus sizes, with the option to switch an order to the iPhone 8 after the phone is announced next week. If the group hits its $300,000 goal, it expects to start shipping in February.

Kickstarter’s most successful fundraiser shares lessons from a failed campaign

PodCase’s search for $300,000 on Kickstarter has ended — not with a bang but a whimper. Earlier this week, the company posted an update to its page, explaining that it would not be continuing the campaign after having pulled in less than a tenth of its goal, with around three days left.

“As I’m sure you can see, this project was way less successful than we had intended,” the product’s creators noted. “Unfortunately it will not be funded and we will not be able to manufacture PodCase as it stands today, at least on the timeline that we were aiming for.”

The project was notable not just for its clever solution to the problem of carrying around an extra AirPods case, but also for the team involved. The project was the work of Avegant co-founder Allen Evans, Pebble lead designer Steve Johns and the bygone smartwatch startup’s founder, Eric Migicovsky. With that sort of pedigree, it was a bit of a surprise to see the project come up so short.

After all, Pebble currently commands three of the top five Kickstarter projects of all time (joined by a “cooperative nightmare horror game experience” and, naturally, a party cooler). Of course, PodCase hedged its bets a bit with an early press push, noting that the company wasn’t actually a company, per se. It was more the work of a few industry vets attempting to change the conversation around what it means to be a hardware startup.

As Migicovsky told me early last month, “There are some products that don’t need an entire company around them.” The idea behind the PodCase’s launch was to get all of the required funding in one fell swoop. In other words, the team would never seek outside investor funding and instead it would simply treat each product as its own self-contained project. Create a product, fund it, release it, repeat.

“We were trying to run this experiment where we were trying to see if we could fund this entire experiment on the back of one Kickstarter campaign,” Migicovsky told me on the phone earlier today. “I think the answer is no, at least for these products. Imagine if we had done it the other way, where people see the barrier really low and spend a lot of money on advertising and it’s not successful. Then you’re stuck holding the bag. You need to either find investors that will post up the cash to fund the same operation that you’d already promise your Kickstarter backers. I think we were a little more honest and upfront with people.”

Of course, such an undertaking also requires a lot of money upfront, which is the primary reason behind the PodCase’s lofty $300,000 all or nothing goal. Launching a Kickstarter campaign always requires a level of mental math, weighing demand against financial need. And when you’re rejecting the possibility of external funding, the latter increases dramatically. In the case of this campaign, the ultimate number missed the mark — by a lot.

It’s not that there was no demand for the product. It was certainly a clever approach, and the company pulled in 325 backers, but a lot of different pieces have to line up perfectly to make someone a potential customer for a product like this.

“We [overestimated] the number of people that were interested in solving this problem at the expense of another case on top of their phone,” Migicovsky says. “The majority of the AirPod base we were going after didn’t overlap with the group of people who were interested in putting cases on their phone, at least in the configuration that we showed off.”

To the casual observer, it also appears as though the team’s intentional lack of resources came into play here. Migicovsky waves off the notion that a proper PR team is necessary to a successful Kickstarter campaign, but when you’re looking for a minimum of hundreds of thousands, it’s certainly a big help.

The team’s relative nonchalance about the whole thing means, perhaps, that its inability to meet the goal is a little less heartbreaking — for them, at least. At the very least, it leaves fewer people in the lurch — the all or nothing approach means that backers are disappointed, but not out $70 or $80. Instead, Migicovsky is approaching the whole thing as a sort of learning experience.

“We treated it like a fishing expedition,” he says. “We tried to just see if people were interested. The other way was what we did with Pebble, where we’d already been working on it for four years, we’d already launched our first version, we already had a couple thousand users, and we knew that we’d take the feedback from the early versions and funnel it into Pebble. This was the first shot in the dark with this concept and we got the feedback we needed.”

As far as what that means for the future of this project specifically, the team isn’t ruling out the possibility of another go at the PodCase, perhaps with a focus on the iPhone X. The overlap between iPhone X and AirPod users is probably pretty large. And people are likely going to want a case to protect their $1,000 phone.

“I have no idea one way or another,” says Migicovsky, “but it doesn’t represent a blocker for me. I spent four years launching various versions of what would become Pebble before launching it on Kickstarter. That was never a mental blocker for us.”

Honest Company may be raising a down round

The Honest Company, the five-year-old natural body and home care products company cofounded by the actress Jessica Alba, looks to be raising $75 million in new venture capital funding at $19.60 per share, according to a Delaware filing first spied by CBInsights and reported by Axios.

The amount is a far cry from the $45.75 per share price point of the company’s $100 million Series D round, closed in 2015 at what was reportedly a post-money valuation of $1.7 billion.

It also endangers The Honest Company’s coveted — or problematic, depending on your viewpoint — status as a so-called unicorn company.

While boasting a billion-dollar valuation puts companies in somewhat elite company with other richly valued private companies, high-flying valuations can also limit a company’s exit options.

The Honest Company may have already proved too rich for at least one acquirer. Roughly a year ago, the outfit was reported to be in talks with Unilever about a potential tie-up; soon after, Unilever opted instead to acquire Honest competitor Seventh Generation for $600 million.

Last year, the WSJ reported that Honest was generating $300 million in annual revenue after raising more than $220 million from investors, including General Catalyst Partners, Lightspeed Venture Partners, Institutional Venture Partners, Fidelity, Wellington Management and Hartford Financial.

The company hasn’t enjoyed smooth sailing since, seemingly. Honest cofounder Brian Lee stepped down as CEO, replaced by former Clorox executive Nick Vlahos, who has been tasked with positioning Honest as a more traditional packaged goods company. (Lee is a renowned tech entrepreneur whose past companies include Legal Zoom and ShoeDazzle.)

The company also cut 80 jobs in the first quarter of this year as it pushed into more offline channels. Indeed, while at the outset, Honest sold its products exclusively at its own website, its various products are also available to buy today at Target, Whole Foods, CVS, Nordstrom, and elsewhere.

The company has also found itself fending off a number of lawsuits over the years from consumer advocacy groups concerned about its product labeling. We talked with Alba about those suits last year in an on-stage discussion at our Disrupt show in New York.

We hope to have more on the new round soon. In the meantime, we reached out to an Honest Company representative for comment and were sent the following statement:

As a matter of policy, The Honest Company does not publicly comment on matters related to our financing activities or valuation, except to say that we are well-capitalized to execute on our long-term strategy.

Our team is focused on executing a plan that builds on our success to date and transforms Honest into a true omni-channel company that delivers the most authentic, engaging and seamless customer experience possible, wherever our customers shop.

In keeping with this strategy, we’re investing heavily in our sales, R&D, brand & retail marketing and fulfillment teams, and we have made several key changes at the management level, all as part of the strategic shift from e-commerce to omni-channel to drive company performance.

We have also begun to assess our international strategy as we look toward the future and our goal of creating a truly global brand.

We seek to provide baby, beauty, personal care and home care products which delight modern consumers and families everywhere with their safety, design and performance, and are focused on making our products accessible to as many people as possible.

 

Uber only has itself to blame for London license loss

The tech industry’s over-processed supply of irony might not be enough to service all the ramifications of Uber being stripped of its London license by the city’s transport regulator.

Uber advocates were immediately scrambling to bust out the reactionary clichés — painting the regulator as “anti-innovation” and claiming London is now ‘closed for digital business’. (A point that might have more substance if they were talking about Brexit.)

Guys. Spare us. Please.

NB: A regulator’s job is literally to uphold a set of standards on behalf of the public, not to bow down before your shiny app.

The old ‘They’ve caved to the taxi cartels and/or the unions!’ refrain was also wheeled out and waxed off. Harder to spot: Any mention of how much Uber spends on lobbying lawmakers to influence regulatory decisions in its commercial favor.

Nor how Uber mobilizes its app infrastructure to create thousands-strong lobbying armies to apply pressure to city authorities at key moments of regulatory threat.

So — quelle surprise! — there’s already a petition with hundreds of thousands of signatures against TfL’s decision. A petition set up and promoted by, er, Uber, of course…

At the same time, some genuinely outraged London Uber users, who have become accustomed over the past five+ years to a VC-subsidized regime of unsustainably cheap cab rides, have taken to social media to cry that it’s simply not fair!

And to wonder aloud how they’ll be able to go anywhere without Uber. This in a city that has one of the most extensive and accessible public transport networks in the world, not to mention a large number of private hire vehicle companies other than Uber, some of which can also be summoned by an app (such tech! much innovation! wow).

How will we get home safely now, fretted others, apparently untroubled by the fact that London’s Met Police had informed the regulator Uber was failing to report sex attacks by drivers on its platform. TfL cited Uber’s “approach to reporting serious criminal offenses” as a contributing factor to its decision to withdraw licensing.

The deepest irony of all is that Uber can continue to operate in London while it appeals the regulator’s decision. Which will, at very least, take months. It could take years.

Being told you’re not “fit and proper” to operate a service yet allowed to keep operating your service? Tell me again exactly how London is ‘closed for digital business’?

Uber for a laundry list of scandals

Corporate social responsibility? Uber’s company fabric has demonstrably been cut from a very different kind of cloth. That’s why its new CEO is right now having to triage a laundry list of scandals — from dealing with an internal culture of sexism and bullying; to privacy and security failings so massive Uber just had to agree to two decades of oversight by a US regulator; to what appears to be a disturbing habit of building software tools that aim to blur the line of legality — such as by helping it evade regulators or slurp data from rivals.

Meanwhile Uber intones that TfL’s decision will “put more than 40,000 drivers out of work”. And claims it’s going to court to “defend the livelihoods of all those drivers”.

Yes, this really is the same company that studiously avoids ’employing’ any of those thousands of platform dependents — rather it categorizes them as ‘self-employed contractors’. Being ‘in work with Uber’ means accepting the risk and responsibility of being precariously managed by a technology entirely beyond your control.

Uber has even tried to monetize that insecurity by selling personal injury and illness insurance to its drivers. How very innovative indeed! Such a shame it doesn’t provide sick pay in exchange for sweating toil in the first place.

In a test case last year, a UK employment tribunal disagreed with Uber’s classification of drivers as self-employed contractors — ruling the company must pay the individuals in question the national minimum wage, as well as cover holiday pay and provide adequate work breaks.

Uber’s business has of course been structured to try to avoid the expensive rights of millions and millions of workers landing on its balance sheet. Despite the fact that, without the labor (and possessions) of all those drivers it wouldn’t be able to deliver its service.

Displaying a very black sense of humor, Uber calls its powerless platform precariat “partners”. Even as it routinely instructs its lawyers to appeal decisions seeking to expand drivers’ rights. And even though it fought for so long against adding a tips option to its platform. (It routinely challenges any moves by cities trying to raise safety standards for Uber users too.)

But politicians are waking up to gig economy regulation. As indeed are gig economy workers. That Uber employment tribunal ruling looks like both warning klaxon and tip of a titanic iceberg.

So if you’re an entrepreneur, and circumventing employment regulation is your benchmark for ‘innovation’, it’s really time to get a new playbook.

In Europe, governments are as un-fond of seeing their tax bases shrinking as workers are their rights evaporating. While legal minds do appear to have grokked how a tech business which replaces human managers with an app that barks orders is still, er, managing workers.

Europe also appears to be approaching a consensus legal view that a tech platform whose primary business is the delivery of transport services is — wait for it — a transportation company. And should therefore be regulated as a transportation company.

The legal mists Uber has exploited for so long look to be clearing.

And so if your ‘innovative’ business model is intent on siphoning ‘disruptive fuel’ from the tightly managed labor of thousands of people who you won’t classify as workers, you might find VCs aren’t as elated by your pitch as you imagined.

Mark Tluszcz, CEO at VC firm Mangrove Capital Partners, had this cautionary warning following the Uber decision: “There are fundamental issues with the business models of many gig economy companies. While they offer great services and excellent value for money, they are often dependent on not paying salaries, taxes and insurance.”

Oops!

But no matter — none of that stuff is a barrier to Uber using the precarious livelihoods of its non-employees as an emotive cry for a brake on the TfL regulatory decision right now, and as the claimed justification for what could be years of legal action and uncertainty as it seeks to force the regulator into reverse.

Now don’t get me wrong. TfL isn’t perfect by any means. You can certainly — and people have — call out the regulator for letting Uber operate for more than five years in the face of mounting concerns. (Or, well, you could say it was demonstrating that London is open for digital business?)

Arguably it could and perhaps should have stepped in sooner to investigate issues being raised. Although it would surely have faced the same or an even more fierce cry of ‘anti-innovation’ had it moved to strip Uber’s license earlier.

The most biting response to TfL’s decision came from James Farrar, co-claimant in the Uber employment tribunal decision, who described it as “a devastating blow for 30,000 Londoners who now face losing their job and being saddled with unmanageable vehicle related debt”.

Although his assessment does also underline exactly how precarious it is for anyone to put their faith in a rights-less platform to be their forever reliable non-employer.

I mean, this is also a company that has publicly stated its ambition is to remove human drivers from its business equation entirely — and replace them with autonomous machines. So its ‘partnership’ offer has always come with plenty of caveats.

But Farrar’s suggestion that TfL should have sought to “strengthen” its regulatory oversight earlier does have some merit. Specifically he says it should have curbed Uber’s “runaway licensing” and sought to protect “the worker rights of drivers”.

It’s the best critique I’ve seen of TfL’s ruling. However it does risk eliding the public safety issue.

As indeed do many of the male voices that have been so quickly raised to speak up for Uber and to brand TfL as ‘anti-innovation’.

Perhaps that’s unsurprising, given it’s women who are disproportionately the victims of sex crimes.

For most men a ride home with a stranger probably sounds like a welcome convenience. For most women the first consideration before getting into a car alone is: Is this going to be safe?

And on the topic of safety, did you hear the story of how an Uber user in the U.S. who was raped by an Uber driver in India is now suing the company for privacy violations after it emerged Uber’s president of business in AsiaPac had accessed, and was carrying around, her medical records? It’s hard to imagine a more textbook example of failing on all counts at corporate social responsibility.

The bottom line is a regulator’s responsibility is to ensure the entities it grants licenses to are up to its accepted standard. And TfL evidently believes it’s seen enough bad stuff attached to Uber’s business operations in London to merit revoking its pass to operate.

Given how tattered Uber’s corporate reputation is, who can blame them?

Even Uber’s new CEO has conceded this point — in an internal letter to staff about the London license loss, which was leaked to a journalist, he writes: “The truth is that there is a high cost to a bad reputation.”

The end of the road for antisocial?

Regulators are also, as a rule, underfunded and overworked. These public bodies don’t enjoy the kind of VC largess that allows an entity like Uber or Facebook to aspire to ‘move fast and break things’. So it’s unrealistic — and more than a little ridiculous — to demand that a small public body like TfL funds lengthy interventions aimed at educating far better resourced corporate giants on being socially responsible and on ensuring public safety.

The massive asymmetry between the understaffed regulatory overseers of civic society and the elite techno disruptors, stuffed to the gills with the finest engineers money can buy (but apparently no one who passed a course in ethics), has clearly enabled certain tech entities to accelerate their business growth at the expense of responsibility.

At times some are essentially dispensing with legality.

Uber grew by ignoring extant transport rules. Indeed, in the past, it was proudly and loudly breaking such rules. Told by a German court in 2014 to cease operating nationwide, Travis Kalanick era Uber told the judges to stuff their injunction and pressed the pedal to the metal.

So there’s another rich irony to Uber’s new CEO now pleading with the London regulator not to apply its rules, and calling for it to “work with us to make things right”… But hey, at least he’s gaslighting nicely.

While, in the case of another platform giant — Facebook — the result of being powered by a business logic that’s 100% geared towards commercial optimization at massive scale is currently being liberally painted across U.S. political headlines.

And across the prematurely aged visage of its remorseful-in-retrospect founder…

Facebook is a content-curating company that, until very recently, resisted being classified as a media company. For as long as possible it sought to eschew any kind of editorial responsibility for the user generated content flowing across its platform — even as its fleet of engineers worked to tune algorithms to distribute content at an unprecedentedly vast scale and with an invasively exact degree of interest-targeting.

‘But we didn’t think of that’, it bleats now, in response to the revelation that its ad system allowed the micro-targeting of ads to users with a stated preference for ‘burning Jews’.

‘We just didn’t imagine this vast anyone-can-advertise-to-anyone platform might be used by Kremlin agents — even though, well, they paid us in Rubles and hailed from a known pro-Putin troll farm,’ it now finds itself having to say.

It’s a vastly disingenuous response to a crisis entirely of Facebook’s own making.

Social responsibility? Oh hell no! We’re just engineers.

Here’s the postmortem on Facebook’s antisocial fuck-up: If your business is building powerful tech tools that you make freely available to almost anyone who wants to use them, and yet you also refuse to accept responsibility for ensuring those tools are not also misused at scale, then don’t be too surprised when the monster you’ve unleashed comes back to bite your personal political ambitions in the ass, Zuck.

Turns out if you’re truly fixated on moving fast and breaking stuff — and you have enough VC cash behind you to fuel your one-way rocket — you actually can end up breaking some really, really, REALLY big stuff — like, er, democracy…. Thing is, no one is clapping now are they Facebook? (Well, no one outside Russia.)

Uber’s bending of the transport rulebook might seem to pale in comparison with Facebook’s insistence that ads on its platform are just another type of ‘user content’ to be inserted into anyone’s eyeballs so long as you hand it a little bit of fiat currency.

But the harm is actually more immediately obvious.

Those thousands of London Uber drivers who bought into its platform on the vague promise of a ‘partnership’. Who took out loans to fund the shiny vehicles that Uber’s business relies on. They’re the ones saddled with horrible uncertainty and terrible risk.

They have all the responsibility, and none of the rights.

And let’s not forget all the unseen risk being absorbed by individual Uber users getting into cars with strangers and taking at face value the company’s claims that it is safe for them to do so.

The regulator’s verdict is that no, actually, we are not convinced it is safe for you to get in the car.

Frankly this has nothing to do with innovation. And everything to do with how poorly Uber has operated as a company to have reached such a very low pass.

“We wouldn’t say that a car with no speed limits or seat belts is an innovative car. Innovation is precisely about coming up with new solutions to problems. Solutions that create more problems than they solve are not really solutions,” says Gemma Galdón, founder and CEO at data consultancy Eticas Research commenting on TfL’s Uber verdict.

“While Uber is free to design its business model, regulators need to ensure that the framework they operate in protects fundamental rights and values, including workers’ rights… If Uber cannot come up with a business model that is both innovative and compliant with the law, this may say more about Uber’s innovation capacity than about the regulator, who is just doing its job.”

“Not all tech innovations try to thrive regardless of their impact on labor rights, the environment or social inequalities,” she adds. “In the future, non-civic tech should be as unthinkable as cars without speed limits or belts.”

There’s yet another irony here: By failing to apply its ride-hailing technology in a socially responsible way Uber has made it more possible for fast-following competitors to elbow in and address those corporate failures — such as by offering a better ‘partnership’ package for drivers. Or by finding ways to make London’s more rigorously regulated black cabs more affordable for people to use.

So far, Uber’s main weapon to stave off competition has been to drive down fare prices. But even Uber can’t burn VC cash forever. It will have to raise prices to turn a profit or it can’t hope to deliver the necessary return to its many investors.

Analysis suggests its investors are subsidizing the cost of rides to the tune of around 60 per cent. Which means that that Uber trip which cost you £8 actually cost £20. Not so ‘price disruptive’ now, eh.

And given how many of the London Uber users complaining about TfL’s decision to strip the company of its license say it’s Uber’s “affordability” that they love, I’d wager that an Uber that charged fares far closer to the rates of London’s black cabs wouldn’t find itself half so popular.

On demand ride-hailing apps? They aren’t as innovative as they used to be. The question now is: What else does your business offer us?

You Are Not Google

Software engineers go crazy for the most ridiculous things. We like to think that we’re hyper-rational, but when we have to choose a technology, we end up in a kind of frenzy — bouncing from one person’s Hacker News comment to another’s blog post until, in a stupor, we float helplessly toward the brightest light and lay prone in front of it, oblivious to what we were looking for in the first place.

This is not how rational people make decisions, but it is how software engineers decide to use MapReduce.

As Joe Hellerstein sideranted to his undergrad databases class (54 min in):

The thing is there’s like 5 companies in the world that run jobs that big. For everybody else… you’re doing all this I/O for fault tolerance that you didn’t really need. People got kinda Google mania in the 2000s: “we’ll do everything the way Google does because we also run the world’s largest internet data service” [tilts head sideways and waits for laughter]

How many stories are your data center buildings? Google chose to stop at 4, for this one in Mayes County, Oklahoma.

Having more fault tolerance than you need might sound fine, but consider the cost: not only would you be doing much more I/O, you might be switching from a mature system—with stuff like transactions, indexes, and query optimizers—to something relatively threadbare. What a major step backwards. How many Hadoop users make these tradeoffs consciously? How many of those users make these tradeoffs wisely?

MapReduce/Hadoop is a soft target at this point because even the cargo culters have realized that the planes ain’t en route. But the same observation can be made more broadly: if you’re using a technology that originated at a large company, but your use case is very different, it’s unlikely that you arrived there deliberately; no, it’s more likely you got there through a ritualistic belief that imitating the giants would bring the same riches.

Ok, so yes: this is another “don’t cargo cult” article. But wait! I have a helpful checklist for you, one you can use to make better decisions.

Cool Tech? UNPHAT.

Next time you find yourself Googling some cool new technology to (re)build your architecture around, I urge you to stop and follow UNPHAT instead:

  1. Don’t even start considering solutions until you Understand the problem. Your goal should be to “solve” the problem mostly within the problem domain, not the solution domain.
  2. eNumerate multiple candidate solutions. Don’t just start prodding at your favorite!
  3. Consider a candidate solution, then read the Paper if there is one.
  4. Determine the Historical context in which the candidate solution was designed or developed.
  5. Weigh Advantages against disadvantages. Determine what was de-prioritized to achieve what was prioritized.
  6. Think! Soberly and humbly ponder how well this solution fits your problem. What fact would need to be different for you to change your mind? For instance, how much smaller would the data need to be before you’d elect not to use Hadoop?

You Are Also Not Amazon

It’s pretty straightforward to apply UNPHAT. Consider my recent conversation with a company that briefly considered using Cassandra for a read-heavy workflow over data that was loaded in nightly:

Having read the Dynamo paper, and knowing Cassandra to be a close derivative, I understood that these distributed databases prioritize write availability (Amazon wanted the “add to cart” action to never fail). I also appreciated that they did this by compromising consistency, as well as basically every feature present in a traditional RDBMS. But the company I was speaking with did not need to prioritize write availability since the access pattern called for one big write per day. 🤔

Amazon sells a lot of stuff. If “add to cart” occasionally failed, they would lose a lot of money. Is your use case the same?

This company considered Cassandra because the PostgreSQL query in question was taking minutes, which they figured was a hardware limitation. After a few questions, we determined that the table was around 50 million rows and 80 bytes wide, so would take around 5 seconds to be read in its entirety off SSD, if a full FileScan were needed. That’s slow, but it’s 2 orders of magnitude faster than the actual query. 🤔
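As a rough check on that estimate, here is a minimal Python sketch of the arithmetic. The row count and row width come from the conversation above; the 800 MB/s sequential read rate is an assumption chosen to match the 5-second figure, not a measured value, and a real scan would also pay for page overhead and be helped or hurt by caching.

```python
# Back-of-envelope full-scan estimate for the table described above.
ROWS = 50_000_000          # "around 50 million rows" (from the text)
ROW_BYTES = 80             # "80 bytes wide" (from the text)
SSD_BYTES_PER_SEC = 800e6  # ASSUMPTION: ~800 MB/s sequential read rate


def scan_seconds(rows: int, row_bytes: int, bytes_per_sec: float) -> float:
    """Time to stream the whole table once, ignoring caching and overhead."""
    return rows * row_bytes / bytes_per_sec


table_bytes = ROWS * ROW_BYTES
full_scan = scan_seconds(ROWS, ROW_BYTES, SSD_BYTES_PER_SEC)

print(f"table size: {table_bytes / 1e9:.1f} GB")
print(f"full scan:  {full_scan:.1f} s")
print(f"100x that:  {full_scan * 100 / 60:.1f} min")
```

Swapping in the rated throughput of your own drive gives a floor on how long a full scan should take. A query that runs far slower than that floor is probably limited by something other than hardware.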

At this point, I really wanted to ask more questions (understand the problem!) and had started weighing up about 5 strategies for when the problem grew (enumerate multiple candidate solutions!), but it was already pretty clear that Cassandra would have been the wrong solution entirely. All they needed was some patient tuning, perhaps re-modeling some of the data, maybe (but probably not) another technology choice… but certainly not the high-write-availability key-value store that Amazon created for its shopping cart!

Furthermore, You Are Not LinkedIn

I was surprised to discover that one student’s company had chosen to architect their system around Kafka. This was surprising because, as far as I could tell, their business processed just a few dozen very high value transactions per day—perhaps a few hundred on a good day. At this throughput, the primary datastore could be a human writing into a physical book.

In comparison, Kafka was designed to handle the throughput of all the analytics events at LinkedIn: a monumental number. Even a couple of years ago, this amounted to around 1 trillion events per day, with peaks of over 10 million messages per second. I understand that Kafka is still useful for lower throughput workloads, but 10 orders of magnitude lower?

The Sun, while massive, is only 6 orders of magnitude larger than Earth.

Perhaps the engineers really did make an informed decision based on their expected needs and a good understanding of the rationale of Kafka. But my guess is that they fed off the community’s (generally justifiable) enthusiasm around Kafka and put little thought into whether it was the right fit for the job. I mean… 10 orders of magnitude!
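The size of that gap is easy to check. The sketch below is a back-of-envelope calculation, assuming 100 transactions per day as a stand-in for “a few dozen … a few hundred”; the trillion-events figure is the one quoted above.

```python
import math

LINKEDIN_EVENTS_PER_DAY = 1e12   # "around 1 trillion events per day" (from the text)
SHOP_TRANSACTIONS_PER_DAY = 100  # ASSUMPTION: stand-in for "a few dozen ... a few hundred"
SECONDS_PER_DAY = 24 * 60 * 60


def orders_of_magnitude(big: float, small: float) -> float:
    """Base-10 log of the ratio between two positive quantities."""
    return math.log10(big / small)


gap = orders_of_magnitude(LINKEDIN_EVENTS_PER_DAY, SHOP_TRANSACTIONS_PER_DAY)
seconds_between = SECONDS_PER_DAY / SHOP_TRANSACTIONS_PER_DAY

print(f"gap: about {gap:.0f} orders of magnitude")
print(f"one transaction every {seconds_between / 60:.1f} minutes on average")
```

At the assumed rate, a transaction arrives roughly once every quarter of an hour, which is a pace a human with a physical book could plausibly keep up with.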

You Are Not Amazon, Again

More popular than Amazon’s distributed datastore is the architectural pattern they credit with enabling them to scale: service-oriented architecture. As Werner Vogels pointed out in this 2006 interview by Jim Gray, Amazon realized in 2001 that they were struggling to scale their front end, and that a service-oriented architecture ended up helping. This sentiment reverberated from one engineer to another, until startups with just a few engineers and barely any users started splintering their brochureware app into nanoservices.

But by the time Amazon decided to move to SOA, they had around 7,800 employees and did over $3 billion in sales.

The Bill Graham Auditorium in San Francisco has capacity for 7,000 people. Amazon had around 7,800 employees when it moved to SOA.

That’s not to say you should hold off on SOA until you reach the 7,800 employee mark… just, think for yourself. Is it the best solution to your problem? What is your problem exactly, and what are other ways you could solve it?

If you tell me that your 50-person engineering organization would grind to a halt without SOA, I’m going to wonder why so many larger companies do just fine with a large but well-organized single application.

Even Google Is Not Google

Use of large-scale dataflow engines like Hadoop and Spark can be particularly funny: very often a traditional DBMS is better suited to the workload, and sometimes the volume of data is so small that it could even fit in memory. Did you know you can buy a terabyte of RAM for around $10,000? Even if you had a billion users, this would give you 1 kB of RAM per user to work with.
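The per-user figure follows from simple division. A minimal sketch, assuming a decimal terabyte (10^12 bytes) and exactly one billion users:

```python
TERABYTE = 10**12      # ASSUMPTION: decimal terabyte; a binary TiB (2**40) is about 10% larger
USERS = 1_000_000_000  # "a billion users" (from the text)


def bytes_per_user(total_bytes: int, users: int) -> int:
    """Whole bytes of memory available to each user if RAM is split evenly."""
    return total_bytes // users


per_user = bytes_per_user(TERABYTE, USERS)
print(f"{per_user} bytes per user")
```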

Perhaps this isn’t enough for your workload, and you will need to read and write back to disk. But do you need to read and write back to literally thousands of disks? How much data do you have exactly? GFS and MapReduce were created to deal with the problem of computing over the entire web, such as… rebuilding a search index over the entire web.

Hard drive prices are now much lower than they were in 2003, the year the GFS paper was published.

Perhaps you have read the GFS and MapReduce papers and appreciate that part of the problem for Google wasn’t capacity but throughput: they distributed storage because it was taking too long to stream bytes off disk. But what’s the throughput of the devices you’ll be using in 2017? Considering that you won’t need nearly as many of them as Google did, can you just buy better ones? What would it cost you to use SSDs?

Maybe you expect to scale. But have you done the math? Are you likely to accumulate data faster than the rate at which SSD prices will go down? How much would your business need to grow before all your data would no longer fit on one machine? As of 2016, Stack Exchange served 200 million requests per day, backed by just four SQL servers: a primary for Stack Overflow, a primary for everything else, and two replicas.
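To make “doing the math” concrete, here is a small Python sketch. The Stack Exchange request count is from the paragraph above; the dataset size, growth rate, and machine capacity in the second half are hypothetical placeholders to be replaced with your own numbers.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60

# Average request rate implied by "200 million requests per day" (from the text).
avg_requests_per_sec = 200_000_000 / SECONDS_PER_DAY


def years_until_full(current_bytes: float, annual_growth: float, capacity_bytes: float) -> float:
    """Years of compound growth before the data outgrows one machine.

    annual_growth is a multiplier, e.g. 2.0 means the data doubles every year.
    """
    if annual_growth <= 1.0:
        raise ValueError("annual_growth must be greater than 1.0")
    if current_bytes >= capacity_bytes:
        return 0.0
    return math.log(capacity_bytes / current_bytes) / math.log(annual_growth)


# HYPOTHETICAL inputs: 1 TB of data today, doubling yearly, 8 TB of storage on one machine.
horizon = years_until_full(1e12, 2.0, 8e12)

print(f"average load: {avg_requests_per_sec:.0f} requests/s")
print(f"years until one machine is full: {horizon:.1f}")
```

The growth model ignores falling storage prices and larger future machines, so the real horizon is likely longer than this sketch suggests.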

Again, you may go through a process like UNPHAT and still decide to use Hadoop or Spark. The decision may even be the right one. What’s important is that you actually use the right tool for the job. Google knows this well: once they decided that MapReduce wasn’t the right tool for building the index, they stopped using it.

First, Understand the Problem

My message isn’t new, but maybe it’s the version that speaks to you, or maybe UNPHAT is memorable enough for you to apply it. If not, you might try Rich Hickey’s talk Hammock Driven Development, or the Polya book How to Solve It, or Hamming’s course The Art of Doing Science and Engineering. What we’re all imploring you to do is to think! And to actually understand the problem you are trying to solve. In Polya’s galvanic words:

It is foolish to answer a question that you do not understand. It is sad to work for an end that you do not desire.