
Large scale data analysis, HOWTO?

Gong Cheng cheng_gong@yahoo.com

V0.51 2006-03-30
Linux is becoming a popular, if not the standard, platform for data analysts. Compared with commercial operating systems such as Windows and OS X, Linux offers a flexible shell scripting language, high stability, and low cost, which makes it an ideal platform for handling multi-gigabyte data. The rapid growth of the Linux community also makes learning Linux a lifelong journey. A task in Linux is often carried out by a sophisticated combination of basic tools, and a deep understanding of all the available tools may take years. The goal of this HOWTO is to get the reader started on data analysis by covering the most commonly used techniques and considerations in as few paragraphs as possible.

1. Introduction

2. Hardware and general considerations for handling large scale data analysis

3. Moving around with your data

4. Plotting/Visualization of large scale data

5. Programming with large scale data

6. Advanced Issues

7. Troubleshooting

8. Getting Help

9. Concluding Remarks

