Why cloud storage startup Box is considering OpenStack over Amazon S3

June 20, 2013

The backend for cloud storage startup Box is pretty straightforward at the moment, said the company’s VP of engineering. But it’s considering a shift to some next-generation approaches.

Right now, the backend for cloud storage startup Box is “kind of boring,” said engineering VP Sam Schillace at GigaOM’s Structure 2013 conference. But it looks like some interesting changes could be afoot.

On stage with GigaOM senior writer Derrick Harris, Schillace said the company’s backend is pretty straightforward, relying on PHP, MySQL and filers, as well as Amazon Web Services for some backup. But he added that they’re mulling a few next-generation approaches, including a shift to OpenStack Swift.

“We’re looking really hard at OpenStack and Swift, and there are some things we want to add,” he said. “We did a bunch of modeling and it looks like Swift, if we added erasure codes to it, would be very cost-competitive and really solve our problems for a long time.”

When asked by Harris if that meant Box would use OpenStack instead of AWS, Schillace indicated that would be the case, but later explained that the company would use Swift, its own data centers and then move to Amazon’s Glacier (as opposed to Amazon’s pricier S3) for backup.

The big issue with Swift, he said, is growth, especially considering that Box doubles in size a couple of times a year.

“The challenge is mostly that we would be the biggest thing we’ve ever heard of on Swift and we’re not totally convinced we want to go first,” Schillace said.

Still, he added that the company is evaluating the option by building out proofs of concept, putting load on it and testing out what it feels like to operationalize.

Looking ahead, he said, the company will continue to invest in and deepen its platform in hopes that developers will build on top of it and ultimately contribute to a “vibrant app store.” Earlier this month, the company launched a new program called $rev that incentivizes developers to build apps for paying customers.

Another big focus will be on content creation as opposed to just content storage.

“[Not just] better organization but better creation tools,” he said. “So it becomes the place that you go to work and deal with business content.”

A transcription of the session follows.

Transcription details:
Date: 20-Jun-2013
Input sound file: 1007.Day 2 Batch 1

Transcription results:
Session Name: Inside the Box of Box

Joe Weinman
Sam Schillace
Derrick Harris

Joe Weinman 00:05
Thank you guys. I’d like to begin our next session with a quote from America’s poet laureate Theodor Geisel, “Fox and Socks, Knox in box, Fox in socks and Knox on Box, Box on Box.” Thank you very much for that dramatic reading. You’re welcome. Our next session is “Inside the Box of Box,” which Dr. Seuss actually wrote about early and which has finally come to fruition. Please join me in welcoming Derrick Harris, our very own amazing GigaOM writer, and Sam Schillace – did I get that right?

[applause]
Sam Schillace 00:49
You did. No one ever gets it right.
Derrick Harris 00:51
Well, you can ask first, so… Sam, just for context that some people aren’t aware of: Sam built Writely, which became Google Apps. He is now the VP of Engineering at Box. I think the context is kind of important just to get a sense of– if we start off talking about how you did Google Apps and now you’re working at Box. Collaboration, right, but kind of different areas of collaboration, we’ll say, right?
Sam Schillace 01:20
Yeah. A lot of different things going on. Do you want to talk about how collaboration has changed?
Derrick Harris 01:25
Yeah. How does it go from when you’re doing Google to doing Box? You’re serving different users and different use cases.
Sam Schillace 01:32
Yeah. It’s pretty interesting. With the Google stuff, we built Writely originally just for ourselves. We didn’t really understand originally that it was going to become what it has become. It was one of these things where you build it, you launch it, and everybody goes, that’s brilliant. And then you’re like, yeah, that’s brilliant, we’ll just keep going with that. It was a little bit more consumer oriented, and we were a little bit more dogmatic about where that was going.
Sam Schillace 01:58
So very focused on just being completely in the browser, completely in the cloud. The desktop is dead, we’re just going to go with that. The hell with Microsoft, we’re just coming up with this vision, which worked for a while, but then there’s nuance to that as it turns out. There’s inertia from people who actually don’t want to be completely in the cloud. There are use cases that work really well for the lighter documents, and we can talk about that in the change of documents. One of the interesting things I’ve learned at Box is the desktop is actually alive. People do still actually want to work on mobile, and that’s a very interesting twist that we didn’t see or have to deal with when we were doing the Google Docs stuff in the early days.
Sam Schillace 02:45
There’s still a lot of usage– a lot of use cases where you want richer documents and want to move back and forth between the desktop and the cloud instead of just living in the cloud. I have a more nuanced appreciation of it at this point.
Derrick Harris 02:57
That’s crazy that someone would want to use the desktop.
Sam Schillace 03:02
It bugs me. It’s actually– it kind of gets under my skin sometimes because I really do think we’re ultimate– there’s a lot of things about the browser and about the Google Apps model that I really like. I think we’re going to move there over time, but you have to actually bring your customers along. You actually help people move there. There’s plenty of companies that are still tied to the desktop, using the desktop – comfortable with it for whatever reason. We’re just trying to build solutions that connect both worlds. It’s kind of a long theme– a big theme with Box is we’re not one of the giant modelists, we’re more a neutral ground. We’re more pragmatists. We try and play in the middle where people actually need to get work done. It’s maybe less glamorous in some ways, but we feel very strongly about delivering that kind of value.
Derrick Harris 03:52
From an engineering perspective, is that a case where – I don’t want to say you’re backward – you’re working, like you said, for business users? The requirements are a little different, where you still have to have a physical presence, right? How does it affect how you approach product design or building the application?
Sam Schillace 04:12
Yeah. That’s interesting. Obviously, as you always do when you’re building products, you pay a lot of attention – and you talk to your customers a lot. Yeah, it is a little bit of– it feels a little bit like moving backwards. I definitely have that engineer’s reflex to want to go to the right solution. At Google we had the luxury of doing that because we weren’t important to the business. There’s a vision. Eric, Larry and Sergey had a vision, and we were just pursuing it. For a lot of folks that vision worked really well; it works really well in the small business market. For other folks it doesn’t, so what you wind up doing in terms of product development– I’ve got teams that go across the stack completely. I joke with people that I do everything except build chips.
Sam Schillace 05:09
I have infrastructure engineers and I have mobile engineers, and I have Windows desktop engineers, and I have JavaScript experts; we do the whole thing. We do a very sophisticated web application. We’ve got a guy named Nicholas Zakas who came out of Yahoo and built this thing called YUI – he’s a pretty well-known guy in the JavaScript world – totally focused on our JavaScript web app infrastructure, which is much more sophisticated than what we did in the Writely days. At the same time I’ve got a Win8 engineer. I’ve got a set of Android engineers. I’ve got iOS engineers; we’ve got desktop engineers doing all of this stuff. So you’ve got to do all of it to make it work well.
Derrick Harris 05:44
Say you’ve got a web application – you mentioned mobile – it seems like it’s becoming more important now to have a sophisticated mobile application. What does a sophisticated mobile application look like compared to a sophisticated web application?
Sam Schillace 05:56
This is a question that’s been on my mind a lot lately: what do you want to do with documents and content and with collaboration on mobile? It’s a hard question. On the one hand, mobile is all about being very focused, very sophisticated, and very simple, I mean. You want to get one task done and get in and out. You’d think, just if you took documents as an example, you’d probably want very simple documents, to be able to just interact with chunks of them. One of the things I’ve thought about is maybe people ask you questions, and you can answer those questions or comment, or just see little pieces of it rather than trying to zoom in on a document.
Sam Schillace 06:44
On the other hand, more and more business is happening on mobile devices, tablets and phones, so you want all of the functionality that you need to get your job done to show up on those devices. So you can’t really just– what we’re doing right now is what everybody always does when a new platform emerges, which is imitate the old platform. So we’re doing these full-featured rich things on the mobile devices that are just hard to deal with if it’s a spreadsheet–
Derrick Harris 07:10
On your phone is possible.
Sam Schillace 07:13
Yeah. My phone’s starting to feel like a laptop again. It drives me nuts. I’ve just got all of this junk, all of these pages full of things. And applications are getting really full of controls and really crowded. But that can’t be the right answer; it can’t go back to the crappy days of the desktop. The simplicity and focus of these mobile apps makes a lot of sense, but at the same time you’ve got to actually try to figure out how to bring in all this functionality if you’re going to really move off of the desktop and into mobile as a way of working. There’s no simple answer to that. We pay a lot of attention to use cases; we’ve got lots of users for which mobile, in particular tablets, is really, really compelling and really unlocks their business. One of the things that’s really exciting is that tablets turn people into information workers who didn’t use to be information workers.
Sam Schillace 08:03
A great classic example is the construction company that now sends tablets out with the guys that drop off these large yellow machines for rental. Those guys are now information workers – they were just truck drivers a few years ago. So that’s pretty cool.
Derrick Harris 08:16
Do you get a sense at a company like Box– when you’re talking about the benefits and wanting to simple– I’m trying to get sense of how – to use an overused term – consumerization of IT – how consumerized an application like Box has to be given the user base, is it something where…?
Sam Schillace 08:39
This is a great question. This is another thing we think a lot about, which is– the short answer is very. It has to be completely consumerized. And the reason is it used to be the case that your IT guy could just dictate whatever you got. So an enterprise business would sell to the IT professionals – sell to the CIO – and he would shove the software down on the end users and that was just what it was. Mobile now completely disrupts that. So if end users don’t like your product, they don’t use your product; if they don’t use the product, then you fail – you get fired. So you have to actually serve both the end user and the CIO at the same time, which means you have to actually be first class.
Sam Schillace 09:23
So part of what we do – we’ve talked about this – is we actually are a consumer development company in terms of how we run our engineering processes. We do all of the stuff that you’d expect. We do daily releases. We do continuous integration. We do this massive automated testing suite. We do all of the stuff that you do when you’re a consumer company and trying to be very nimble and responsive to the market. We don’t do this one-release-a-quarter, throw-it-over-the-wall, hope-it-works kind of stuff. Sorry if we broke something, you’ll get it fixed in another quarter.
Derrick Harris 09:57
Work fast and break stuff.
Sam Schillace 09:58
Yeah. We don’t quite do move fast and break stuff, because obviously people are running businesses on us. So we pay a lot of attention to breakage. We have a very elaborate release process that happens every day that’s designed to catch breakage before it gets out to more than 1% of our users – and hopefully no percent of our users. You have to think and act and behave like a consumer company, I think, to succeed as an enterprise company. I think that’s going to become more and more the case as mobile unlocks more and more user choice in the enterprise. We see that trend.
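The idea of catching breakage before a release reaches more than 1% of users can be sketched as a percentage-based rollout gate. This is only an illustration under stated assumptions, not Box's actual release process: the function names, the release label, and the hash-bucketing scheme are all hypothetical.

```python
import hashlib


def rollout_bucket(user_id: str, release: str) -> int:
    """Map a user to a stable bucket in [0, 10000) for a given release.

    Hashing the release label together with the user id (an assumed scheme)
    means a different slice of users goes first on each release.
    """
    digest = hashlib.sha256(f"{release}:{user_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 10_000


def sees_release(user_id: str, release: str, percent: float) -> bool:
    """Return True if the user falls inside the first `percent` of buckets."""
    return rollout_bucket(user_id, release) < int(percent * 100)


if __name__ == "__main__":
    users = [f"user-{i}" for i in range(100_000)]  # made-up user ids
    exposed = sum(sees_release(u, "2013-06-20", 1.0) for u in users)
    print(f"{exposed} of {len(users)} users would see the new release first")
```

Because the bucket is derived from a hash rather than a random draw, the same user gets the same answer on every request, so a problem seen in the first slice can be traced and the rollout halted before the percentage is raised.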
Sam Schillace 10:35
We talked earlier about how what happened to the desktop when the web showed up and blew apart these distribution channels had never happened to enterprise. It is now the case that these distribution channels are finally getting blown apart by mobile. And I think the same thing’s about to happen to enterprise: we’re just going to see this explosion of creativity and interesting applications and much higher design standards and usability standards. So get ready.
Derrick Harris 11:06
So, given that the overall theme of the show is infrastructure – I remember Facebook yesterday talking about building network fabrics and Jeff Dean talking about building Spanner – talk about Box underneath all of that. What does the back end look like at a company like Box to support what you’re trying to do?
Sam Schillace 11:29
We’re not Facebook and we’re not Google yet. We’re not quite at that point where we get to build our own networking hardware, software, and things like that. I would love to be able to. We’re at the fortunate point where we’re big but we’re not gigantic. So we’re big enough that it matters, it hurts. But we’re not so big that we can’t use a lot of standard technology. So our back end is actually kind of boring right now. We’re just very straightforward PHP, MySQL, and filers; we don’t do a lot. We do some backup to cloud services because it’s a good operational practice to have data stored in two different systems so that a bug in one system doesn’t affect the other – so we do that.
Sam Schillace 12:11
We’re looking at next-generation things. We’re looking really hard at Swift and OpenStack, and there’s some things we want to add to it. We’d like to have– we did a bunch of modeling recently and it looks like Swift, if we add erasure codes to it, would be very cost competitive and would really solve our problems for a long time. So that’s what we’re thinking about. The big challenge we have is just growth, where we double in size a couple of times a year – it’s really fast growth. We’re fine now, but we’re very much looking two or three years down the road.
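The cost argument for erasure codes comes down to how much raw disk is needed per byte of user data. The arithmetic below is a rough sketch with assumed numbers: three full copies as the replication baseline, a hypothetical 10+4 erasure-coding layout, and a made-up 1 PB starting size. None of these figures come from Box's modeling.

```python
def replication_overhead(copies: int) -> float:
    """Raw bytes stored per logical byte when keeping `copies` full replicas."""
    return float(copies)


def erasure_overhead(data_fragments: int, parity_fragments: int) -> float:
    """Raw bytes stored per logical byte for a k data + m parity layout."""
    return (data_fragments + parity_fragments) / data_fragments


def raw_capacity(logical_pb: float, overhead: float) -> float:
    """Raw petabytes needed to hold `logical_pb` petabytes of user data."""
    return logical_pb * overhead


def projected_size(start_pb: float, doublings_per_year: int, years: int) -> float:
    """Logical size after growing by a factor of 2 per doubling."""
    return start_pb * 2 ** (doublings_per_year * years)


if __name__ == "__main__":
    # Assumed inputs: 1 PB today, doubling twice a year, three years out.
    future_pb = projected_size(1.0, doublings_per_year=2, years=3)
    replicated = raw_capacity(future_pb, replication_overhead(3))
    erasure_coded = raw_capacity(future_pb, erasure_overhead(10, 4))
    print(f"logical data: {future_pb} PB")
    print(f"raw disk with 3 replicas: {replicated} PB")
    print(f"raw disk with 10+4 erasure coding: {erasure_coded} PB")
```

With a Reed-Solomon style code, a 10+4 layout can lose any four fragments of an object and still rebuild it, while storing well under half the raw bytes that three replicas need. The trade-off is extra computation and network traffic on writes and repairs.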
Derrick Harris 12:45
When you say move to Swift, you mean Swift as opposed to Amazon?
Sam Schillace 12:49
Yeah. We’re looking at a lot of different options. One thing we’ve been talking about is Swift in our own data centers; we’re going to run our own data centers. We don’t live on Amazon, we just back up to Amazon. Swift and our own data centers, then moving to Glacier for the backup, because Swift seems like it could be a very robust solution for us. The challenge is mostly that we would be the biggest thing we’ve ever heard of on Swift. We’re not totally convinced we want to go first.
Derrick Harris 13:17
The pioneer tax. Like they talked about in another session.
Sam Schillace 13:20
Yeah, it would be cool. We’re in the process of evaluating; building a few proofs of concept, putting a load on it, seeing what it feels like to operationalize something like that. Yeah. We want to do things like that. Obviously the Google infrastructure’s fun; I kind of miss it. It’s a real scale – it’s a crazy big scale. Although that kind of scale also has its own challenges because it’s hard to really know what’s going on in there.
Derrick Harris 13:46
When you look in terms of– right now you’re fine, but as Box gets more collaborative and gets bigger accounts, and you know you have– when you get these bigger users with 30,000 employees, whatever. What kind of infrastructure do you build to start handling that? Because it seems like those challenges are kind of at an unprecedented scale.
Sam Schillace 14:08
Yeah. This is a different kind of scale, and it’s a big problem for us, so we definitely – we were talking earlier – we’re inventing our own message bus like everybody else has to invent their message bus. It’s kind of funny, these problems come back up again over and over. We have what we call– there’s the Obama problem and the Scoble problem – we joked about this when I was at Google building social networking stuff. The Obama problem is a million people follow him. The Scoble problem is he follows everybody. So it’s these two different– it’s literally true. He’s literally the scourge of – not the scourge – but the pain of all of the social networks because he’s this weird edge case where he follows everybody. And you’ve got to deal with fan-out and fan-in in these weird ways.
Sam Schillace 14:50
So these things that are simple– it’s very easy to deliver notifications. It’s easy to describe: I’m going to make a change here and just deliver it to all of the people that see this folder or whatever. Those are easy to describe, but then when you fire up a 50,000-user account, which we have, and they’re all in one folder or there’s 50,000 folders inside that folder, or whatever, those notification problems actually get pretty hard. So you have to build– that’s a whole layer. Then there’s all this stuff on, if you’ve got 50,000 users, you’ve got to manage them appropriately. That’s got to be– you can’t tab through something, you actually have to have UI that scales to that level.
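The fan-out side of the notification problem can be sketched as follows. This is a minimal in-memory illustration, not Box's message bus: the `FolderNotifier` class, its method names, and the batch size are hypothetical. It shows one change to a folder turning into one delivery per watcher, split into bounded batches so a 50,000-user folder is never handled as a single unit of work.

```python
from collections import defaultdict
from typing import Dict, Iterator, List, Set, Tuple


class FolderNotifier:
    """Toy fan-out: one folder change becomes one notification per watcher."""

    def __init__(self, batch_size: int = 1000) -> None:
        self.watchers: Dict[str, Set[str]] = defaultdict(set)
        self.batch_size = batch_size

    def watch(self, folder_id: str, user_id: str) -> None:
        """Record that a user should hear about changes to a folder."""
        self.watchers[folder_id].add(user_id)

    def fan_out(self, folder_id: str, event: str) -> Iterator[List[Tuple[str, str]]]:
        """Yield (user, event) deliveries in batches of at most batch_size."""
        users = sorted(self.watchers.get(folder_id, ()))
        for start in range(0, len(users), self.batch_size):
            chunk = users[start:start + self.batch_size]
            yield [(user, event) for user in chunk]


if __name__ == "__main__":
    notifier = FolderNotifier(batch_size=1000)
    for i in range(50_000):  # made-up account size matching the example above
        notifier.watch("shared-folder", f"user-{i}")
    batches = list(notifier.fan_out("shared-folder", "file updated"))
    print(f"{len(batches)} batches, {sum(len(b) for b in batches)} deliveries")
```

In a real system each batch would presumably be handed to a queue and processed by separate workers; the sketch only shows why a single write can produce tens of thousands of deliveries.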
Sam Schillace 15:29
So it just gets to be– there’s lots of challenges. We do sharding in the MySQL layer for this reason, because we can’t really keep everything on one– even the metadata doesn’t fit on one machine.
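Splitting metadata across several database servers by key, commonly called sharding, can be sketched as below. The hash-modulo routing, the function names, and the shard count are assumptions for illustration; the talk does not say how Box assigns rows to MySQL servers.

```python
import hashlib
from typing import Iterable, Set


def shard_for(key: str, shard_count: int) -> int:
    """Pick a shard index for a key using a stable hash (assumed scheme)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def shards_touched(user_ids: Iterable[str], shard_count: int) -> Set[int]:
    """Return the set of shards holding metadata for a group of users."""
    return {shard_for(user_id, shard_count) for user_id in user_ids}


if __name__ == "__main__":
    shard_count = 8  # hypothetical number of MySQL servers
    collaborators = [f"user-{i}" for i in range(20)]
    print("user-0 lives on shard", shard_for("user-0", shard_count))
    print("a folder shared by 20 users touches shards",
          sorted(shards_touched(collaborators, shard_count)))
```

Routing by key keeps each lookup on one machine, but anything shared among many users usually spans several shards, so queries about a shared folder have to be gathered from more than one server.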
Derrick Harris 15:41
Are you going to build your own database at some point, is that…?
Sam Schillace 15:43
I hope not. I don’t think we’ll have to build our own database. MySQL is an underrated technology. I think it actually – it’s sort of boring, it’s over, it’s done. People know what it is. It’s not really moving that much, but it actually is a pretty good workhorse; it gets a lot of stuff done, and you can do a lot with it. I don’t think we’re going to, although one of our next challenges is widely distributed multiple data centers. One of the problems of a data set like the Google Docs stuff or Box’s is that the data set itself doesn’t shard; it’s a web, any user can collaborate with any other user on any piece of data.
Sam Schillace 16:19
So there’s no way to say, you guys are all European, you live in Europe; you guys are all Asian, you live in Asia; everybody else in North America lives in the North America set of data centers. Because it doesn’t partition, you’ve got to deal with those long latencies. We won’t build our own database, but we’ll definitely have to solve that problem. And it’s a nasty problem. The latencies are so long you have to be very careful about what you ask for and you get this– you either get a distributed transaction problem or a distributed cache coherency problem, kind of depending on how you slice the problem. So it gets too hard to do well, there’s no good choice in there, and [inaudible] is now your enemy.
Derrick Harris 16:58
You mentioned scaling the UI, that’s something that I think– do you see a problem with? I’ve used Facebook and sometimes– actually I rarely use Facebook, but I used [inaudible] this is busy or something. How do you, how do you scale, and you reply to tens of thousands?
Sam Schillace 17:11
It totally gets– hopefully we’ll get to start working on stuff like this at some point soon. We have a feed of changes that happen and folders that you’re shared to, and it can get very overwhelming even in an organization the size of Box; that feed is actually pretty noisy. It’s getting to be not very useful, so it’s definitely– I’ll hire an expert or two, and start actually thinking about doing some kind of ranking in there and filtering and stuff like that. We might wind up having to solve that problem too.
Derrick Harris 17:45
Does Box become more intelligent to some degree in terms of what it’s displaying and notifying you of?
Sam Schillace 17:52
Yeah. You’d want it to, right? You’d probably– content is the heart of every business. A lot of what we do is optimize the way businesses use content – interact with content. So currently that optimization is relatively straightforward; it’s just we’ll manage it for you, we’ll make it ubiquitous, we’ll make it accessible, we’ll move it off premise. But, yeah, you totally want to keep climbing that stack of making your content more effective and higher leverage.
Derrick Harris 18:18
Last question and I’m going to make it a generic one. But if you look at Box today versus let’s say three years from now after you guys buy IPO, or whatever. What does the product look like – how is Box a different product today than it is years out?
Sam Schillace 18:34
There’s a couple of things. One is we’re going to continue to invest in and deepen the platform aspect of it. I think part of what will be different is hopefully a lot of people will be building on top of it. We just announced our Box $rev thing, which is encouraging developers to get more engagement with a community. I think there’s a real possibility of building out a very vibrant application store with a lot of people in it. I think that would be pretty cool. I think the other big theme is– so developers is one, and then the other big theme is I think content creation, as opposed to just storage. So some of this stuff that you’ve been talking about in terms of adding value to content. So better organization but also better creation tools. It becomes the place that you go to work with and deal with business content. I think those are probably the two big things that’ll change us the most over the next few years.
Derrick Harris 19:34
Listen, we’re out of time. Thanks a lot, Sam.
Sam Schillace 19:35
Thank you.

http://gigaom.com/2013/06/20/why-cloud-storage-startup-box-is-considering-openstack-over-amazon-s3/
