Using Serverless Applications to Process Seismograms
AWS Lambda is a service that lets you run code without provisioning or managing a server. You pay only for the time your code runs. Here we outline an example in which we use AWS Lambda to take one month of 40 sps data, decimate it by a factor of 4, and store the output in S3. The entire process takes 11 minutes and costs $0.68. Under the traditional model, you would probably spend more than 11 minutes just downloading the data! Because the data already live in the cloud archive, there is no need to download a copy first.
Task
Decimate waveforms recorded in January 2016 from 40 sps to 10 sps. The data total 122 GB in 23,262 files.
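As a rough illustration of the decimation step itself, here is a minimal sketch using ObsPy. It is not the code from the SCEDC repository: the function names and the file path in the usage note are hypothetical, and ObsPy is imported inside the function so the rate helper can be used without it.

```python
def decimated_rate(sampling_rate, factor):
    """Sampling rate that results from keeping every factor-th sample."""
    return sampling_rate / factor


def decimate_file(path, factor=4):
    """Read one waveform file and return an ObsPy Stream decimated by factor.

    Hypothetical helper for illustration; not taken from the SCEDC scripts.
    """
    from obspy import read  # lazy import: ObsPy is only needed for real files

    stream = read(path)  # ObsPy detects the waveform format automatically
    # Stream.decimate low-pass filters each trace by default to avoid
    # aliasing, then keeps every factor-th sample.
    stream.decimate(factor=factor)
    return stream
```

For a 40 sps input, `decimated_rate(40, 4)` gives 10, and after `decimate_file("example_file", factor=4)` each trace's `stats.sampling_rate` should report the reduced rate.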
How
- In a Docker container running Amazon Linux, build a zip file that contains ObsPy and the Lambda function.
- Upload the zip file to Lambda.
- A script takes each file, calls the Lambda function on it, and then writes the result to S3.
- Files for this demo can be found in the References section below
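The driver step above could be sketched roughly as follows. This is an assumption-laden illustration, not the script from the SCEDC repository: the event shape (`{"key": ...}`), the function name passed to `make_lambda_invoker`, and the default of 24 workers (chosen to fall in the 20-28 range reported in Results) are all hypothetical. The calls to `boto3.client("lambda")` and `invoke` are standard boto3.

```python
import json
from concurrent.futures import ThreadPoolExecutor


def make_lambda_invoker(function_name):
    """Return a callable that synchronously invokes the named Lambda function."""
    import boto3  # lazy import so the rest of the sketch runs without AWS access

    client = boto3.client("lambda")

    def invoke(payload):
        response = client.invoke(
            FunctionName=function_name,
            InvocationType="RequestResponse",  # wait for the function to finish
            Payload=json.dumps(payload).encode("utf-8"),
        )
        return json.loads(response["Payload"].read())

    return invoke


def process_keys(keys, invoke, max_workers=24):
    """Call invoke once per input key, with up to max_workers calls in flight."""
    payloads = [{"key": key} for key in keys]  # hypothetical event shape
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the inputs
        return list(pool.map(invoke, payloads))
```

A run might look like `process_keys(list_of_keys, make_lambda_invoker("decimate-waveform"))`, where the function name is a placeholder. Passing `invoke` as an argument keeps the parallel loop testable without any AWS credentials.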
Results
- Time to run (with parallel Lambda invocations): 11 minutes. There were 20-28 concurrent Lambda executions, taking 13,566 seconds in total.
- Cost of Lambda: none. Usage stayed under the Lambda free tier limit of 400,000 GB-seconds per month.
- Cost to write to a personal S3 bucket and store the output: $0.11 for PUT requests, $0.57 for S3 storage (27 GB).
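The figures above can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers quoted on this page; the variable names are ours, and the Lambda memory setting is not given here, so the last line only reports how much memory per invocation would have been needed to exhaust the free tier.

```python
wall_clock_s = 11 * 60          # 11 minutes of elapsed time
lambda_total_s = 13_566         # summed Lambda execution time
n_files = 23_262
input_gb, output_gb = 122, 27
free_tier_gb_s = 400_000

# Average number of Lambda executions running at once (about 20.6)
avg_concurrency = lambda_total_s / wall_clock_s

# Average execution time per file, in seconds (about 0.58)
avg_s_per_file = lambda_total_s / n_files

# Input-to-output size ratio (about 4.5, in line with decimating by 4)
size_ratio = input_gb / output_gb

# PUT requests plus storage, in dollars
total_cost = round(0.11 + 0.57, 2)

# Memory per invocation at which the free tier would run out (about 29.5 GB)
free_tier_memory_gb = free_tier_gb_s / lambda_total_s

print(avg_concurrency, avg_s_per_file, size_ratio, total_cost, free_tier_memory_gb)
```

The average concurrency of about 20.6 sits at the low end of the reported 20-28 range, and the two S3 charges add up to the $0.68 total quoted in the introduction.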
References
- Code for Lambda example: scripts are in SCEDC's GitHub repo.
- ObsPy Project
- AWS Getting Started Resource Center: starter page for learning about the types of AWS resources available.