Using Serverless Applications to Process Seismograms

AWS Lambda is a service that lets you run code without provisioning or managing a server, and you pay only for the time your code runs. Here we outline an example that uses AWS Lambda to take one month of 40 sps data, decimate it by a factor of 4, and store the output in S3. The entire process takes 11 minutes and costs $0.68. Under the traditional model, you would probably spend more than 11 minutes just downloading the data! Because the data already lives in the cloud archive, the step of downloading a copy is removed entirely.
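To put the download comparison in numbers, the short sketch below estimates how long a plain download of the January 2016 data set (122 GB) would take, and what sustained bandwidth would be needed to finish it within the same 11 minutes. The 100 Mbit/s link speed and the use of decimal gigabytes are illustrative assumptions, not figures from this demo.

```python
def download_minutes(size_gb, mbit_per_s):
    """Minutes needed to download size_gb at a sustained rate of mbit_per_s."""
    bits = size_gb * 1e9 * 8            # assumption: decimal GB (1 GB = 1e9 bytes)
    return bits / (mbit_per_s * 1e6) / 60


def break_even_mbit_per_s(size_gb, minutes):
    """Sustained rate (Mbit/s) needed to download size_gb in the given minutes."""
    return size_gb * 1e9 * 8 / (minutes * 60) / 1e6


if __name__ == "__main__":
    # 122 GB is the data set size in this demo; 100 Mbit/s is a hypothetical link.
    print(f"At 100 Mbit/s: {download_minutes(122, 100):.0f} minutes")
    print(f"Rate needed for 11 minutes: {break_even_mbit_per_s(122, 11):.0f} Mbit/s")
```

Under these assumptions the download alone takes well over two hours, and matching the 11 minute run would need a sustained rate of roughly 1.5 Gbit/s.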

Task

Decimate waveforms recorded in January 2016 from 40 sps to 10 sps.
The data total 122 GB in 23,262 files.

How

  • A Docker container running Amazon Linux is used to build a zip file that contains ObsPy and the Lambda function.
  • The zip file is uploaded to Lambda.
  • A script takes each file and calls the Lambda function on it; the result is then written to S3.
  • Files for this demo can be found in the References section below
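The steps above can be sketched roughly as follows. This is not the code used for the demo (see the References section for that); it is a minimal illustration under stated assumptions. The event fields src_bucket, dst_bucket, and key, the "decimated/" output prefix, and the helper names are all hypothetical, and the sketch assumes that the Lambda function itself writes the output to S3 and that the files are miniSEED.

```python
import io
import json
from concurrent.futures import ThreadPoolExecutor

FACTOR = 4  # 40 sps -> 10 sps


def output_key(input_key, prefix="decimated/"):
    """Hypothetical naming scheme for the decimated copy of a file."""
    return prefix + input_key


# --- Runs inside Lambda (packaged in the zip file together with ObsPy) ---
def handler(event, context):
    import boto3              # imported lazily; needed only when S3 is touched
    from obspy import read    # bundled in the deployment zip

    s3 = boto3.client("s3")
    raw = s3.get_object(Bucket=event["src_bucket"], Key=event["key"])["Body"].read()
    stream = read(io.BytesIO(raw))     # ObsPy can read from a file-like object
    stream.decimate(FACTOR)            # low-pass filters before downsampling by default
    out = io.BytesIO()
    stream.write(out, format="MSEED")  # assumption: output stays in miniSEED
    key = output_key(event["key"])
    s3.put_object(Bucket=event["dst_bucket"], Key=key, Body=out.getvalue())
    return {"key": key}


# --- Runs on the user's machine: one Lambda invocation per file ---
def invoke_all(keys, invoke_one, max_workers=24):
    """Call invoke_one(key) for every key using a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(invoke_one, keys))


def make_invoker(function_name, src_bucket, dst_bucket):
    """Build a function that synchronously invokes the Lambda for one key."""
    import boto3
    client = boto3.client("lambda")

    def invoke_one(key):
        payload = {"src_bucket": src_bucket, "dst_bucket": dst_bucket, "key": key}
        resp = client.invoke(FunctionName=function_name,
                             Payload=json.dumps(payload).encode())
        return json.loads(resp["Payload"].read())

    return invoke_one
```

A run would then look like invoke_all(keys, make_invoker("my-decimate-fn", "source-bucket", "my-bucket")), where the function and bucket names are placeholders. The default of 24 worker threads is simply a value inside the 20-28 concurrency range reported in the results.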

Results

  • Time to run (with parallel Lambda invocations): 11 minutes, with 20-28 concurrent Lambda executions taking 13,566 seconds in total.
  • Cost of Lambda: none, because the run stays under the Lambda free tier limit of 400,000 GB-seconds per month.
  • Cost of writing to and storing in a personal S3 bucket: $0.11 for PUT requests and $0.57 for S3 storage (27 GB).
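The figures above can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers reported in this section, with one exception: the Lambda memory setting is not stated, so the 1.5 GB used to estimate GB-seconds is purely an assumption for illustration.

```python
wall_clock_s = 11 * 60        # reported run time: 11 minutes
lambda_seconds = 13_566       # reported total Lambda execution time
free_tier_gb_s = 400_000      # Lambda free tier limit quoted above

# Average number of executions in flight over the whole run.
avg_concurrency = lambda_seconds / wall_clock_s

# Assumption: 1.5 GB of memory per function (not stated in this section).
assumed_memory_gb = 1.5
gb_seconds = lambda_seconds * assumed_memory_gb
free_tier_fraction = gb_seconds / free_tier_gb_s

# S3 costs as reported: PUT requests plus storage of the 27 GB output.
put_cost, storage_cost = 0.11, 0.57
total_cost = round(put_cost + storage_cost, 2)

# Output size relative to input size (27 GB out of 122 GB).
size_ratio = 27 / 122

if __name__ == "__main__":
    print(f"average concurrency: {avg_concurrency:.1f}")
    print(f"GB-seconds used: {gb_seconds:.0f} ({free_tier_fraction:.1%} of free tier)")
    print(f"total cost: ${total_cost:.2f}")
    print(f"output/input size ratio: {size_ratio:.2f}")
```

The average concurrency works out to about 20.6, which sits at the low end of the reported 20-28 range, and the two S3 charges add up to the $0.68 total quoted in the introduction. Under the assumed memory setting the run uses only a small fraction of the monthly free tier.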

References