Amazon as Infrastructure for OpenGeo Data?

Charlie Savage has a post worth reading. He says:

In addition to storing the tiles, you’d need machines to store the original vector data, render the tiles and serving them on the Web. And we haven’t even talked about imagery yet, which takes even more space. Even then, having the hardware isn’t enough. You’d also have to have the expertise to run it all, which perhaps is Google’s most treasured secret.

Lately I’ve been exploring Amazon Web Services. I wonder if a combination of S3, EC2 and SQS could provide the infrastructure. S3 and SQS are straightforward. My next step involves EC2: build an Amazon Machine Image (AMI) that runs OSGeo software. It looks like it will be a challenge. Has anyone done this?
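To make the idea a bit more concrete, here is a minimal sketch of how the three services might divide the work: a queue hands out tile-rendering jobs, workers pick them up, and rendered tiles land in an object store. Everything in it is an assumption for illustration. A standard-library `queue.Queue` stands in for SQS, a plain dict stands in for an S3 bucket, `render_tile` is a hypothetical placeholder for a real renderer running on EC2, and the 2^zoom by 2^zoom tile grid is the common web-map layout rather than anything specified here. No AWS calls are made.

```python
import queue


def tiles_at_zoom(zoom):
    """Yield (zoom, x, y) for every tile in an assumed 2**zoom by 2**zoom grid."""
    n = 2 ** zoom
    for x in range(n):
        for y in range(n):
            yield (zoom, x, y)


def render_tile(zoom, x, y):
    """Hypothetical placeholder: a real worker would call a map renderer here."""
    return f"tile {zoom}/{x}/{y}".encode("ascii")


def enqueue_jobs(job_queue, max_zoom):
    """Put one job per tile on the queue, for zoom levels 0 through max_zoom."""
    for zoom in range(max_zoom + 1):
        for tile in tiles_at_zoom(zoom):
            job_queue.put(tile)


def run_worker(job_queue, tile_store):
    """Drain the queue, storing each rendered tile under a zoom/x/y key."""
    rendered = 0
    while True:
        try:
            zoom, x, y = job_queue.get_nowait()
        except queue.Empty:
            return rendered
        tile_store[f"{zoom}/{x}/{y}.png"] = render_tile(zoom, x, y)
        rendered += 1


job_queue = queue.Queue()  # stand-in for an SQS queue (assumption)
tile_store = {}            # stand-in for an S3 bucket (assumption)

enqueue_jobs(job_queue, max_zoom=2)
rendered = run_worker(job_queue, tile_store)
print(rendered, "tiles rendered")
```

The appeal of this shape is that the workers share nothing except the queue and the store, so in principle you could add more of them as the job count grows. The tile count quadruples with each zoom level, which is where the storage worry comes from.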

The more compelling EC2 sites allow users to extract/transform/load their own data and recombine it with other people’s. Jamglue does it with audio. Very slick, check out the tour! Pixnate does it with imagery. Why can’t we do it with spatial data?

I can’t help it, but every time I look into open source and run into a problem, I think of Steve Martin’s epiphany in “The Jerk”: “Ahh, so it’s a profit making deal!”

With that in mind, maybe the Amazon Web Services Start-up Challenge will get some open source geo folks busy building some spatially aware AMIs.

One idea for the challenge might be a site that lets users upload, georeference and publish their own spatial data. Maybe the data could even be made available for purchase at an Amazon store?

Look at it from Amazon’s perspective: the other big players (Microsoft, Google, and Yahoo) are all heavily involved in geospatial data. Amazon is missing the boat. Here’s their chance to catch up.

Navteq and Tele Atlas provide spatial data for GPS receivers in the same way iTunes provides data for iPods. Since Amazon is now competing with iTunes by selling DRM-free MP3 files, maybe they could do something similar to compete with Tele Atlas and Navteq?

Of course, the difficulty would be giving data collectors enough incentive to match the quality offered by Navteq and Tele Atlas. Maybe Inrix’s smart dust concept could be extended? The coverage of the Inrix real-time data is disappointing; they need more data collectors. As city-wide WiFi becomes commonplace, WiFi-enabled GPS will follow. As traffic gets worse, demand for real-time traffic data will increase. Amazon could sell this data and give data collectors a discount. Every time you drive to work with your GPS on, you are collecting potentially valuable data. More on this later.


2 comments so far

  1. Paul Bissett on

    Have you seen what we are doing at WeoGeo? Incentivizing geodata collectors and geocontent providers is exactly what we are trying to do. Combining content, putting it back up for sale, and protecting everyone in the content stack is an important component of our business model for geodata Providers and Users.

    Our whole architecture is built on AWS EC2 and S3. You may wish to review a couple of posts today on my blog that specifically address the functionality (and this weekend’s outage) of EC2.

  2. josh l on

    I had no great trouble setting up EC2 with a basic GIS stack (PostGIS/MapServer). Amazon’s instructions are quite good.

    I started with one of the Ubuntu AMIs. At that point you could just use apt-get and be done with it, but if you want the latest stuff you can follow something similar to Aaron Racicot’s notes at
