Data Management Guide (3)
By Hongyu Xiao
Contact: hongyu.xiao@ou.edu
Learn about data handling on the cluster:
- File transfer methods (SCP, SFTP)
  - Using SCP (Secure Copy Protocol):
    - Command format: scp source destination
    - Upload: scp localfile user@remote:/path/
    - Download: scp user@remote:/path/remotefile localfile
    - Example: To copy a file named "data.txt" to your home directory on the cluster:
        scp data.txt username@schooner.oscer.ou.edu:~/
    - Example: To copy an entire directory recursively:
        scp -r local_directory/ username@schooner.oscer.ou.edu:~/destination/
    - Example: To download results from the cluster:
        scp username@schooner.oscer.ou.edu:~/results/output.dat ./
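A copied file can be checked afterwards by comparing checksums on both ends. The sketch below assumes GNU sha256sum is installed on your machine and on the cluster; digest_of and matches_digest are hypothetical helper names introduced here for illustration and are not part of scp.

```shell
# Print only the SHA-256 digest of a file.
# sha256sum prints "<digest>  <filename>"; cut keeps the first field.
digest_of() {
    sha256sum "$1" | cut -d ' ' -f 1
}

# Return 0 if the file's digest equals the expected digest,
# 1 if it differs, and 2 if the file does not exist.
matches_digest() {
    [ -f "$1" ] || return 2
    [ "$(digest_of "$1")" = "$2" ]
}
```

One possible workflow: run sha256sum ~/results/output.dat on the cluster, download the file with scp, then run matches_digest ./output.dat followed by the digest you noted. A non-zero exit status suggests the copy is incomplete or corrupted and should be transferred again.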
  - Using SFTP (SSH File Transfer Protocol):
    - Interactive file transfer session
    - Support for multiple file operations
    - Example: To start an SFTP session:
        sftp username@schooner.oscer.ou.edu
    - Common SFTP commands:
        get remotefile      # Download a file
        get -r remotedir    # Download a directory recursively
        pwd                 # Show the current remote directory
        cd directory        # Change the remote directory
        ls                  # List remote files
        quit                # Exit the SFTP session
    - Example: Download a file named "results.txt":
        sftp username@schooner.oscer.ou.edu
        sftp> cd results
        sftp> get results.txt
        sftp> quit
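The interactive session above can also be scripted with sftp's batch mode (the -b option), which reads commands from a file. The following is a minimal sketch: write_sftp_batch and the file name fetch.batch are hypothetical names chosen for illustration, and the sketch assumes file names without spaces.

```shell
# Write an sftp batch file that changes to a remote directory,
# downloads each named file, and then quits.
# Arguments: batch file path, remote directory, one or more remote file names.
write_sftp_batch() {
    batch_file="$1"
    remote_dir="$2"
    shift 2
    {
        printf 'cd %s\n' "$remote_dir"
        for name in "$@"; do
            printf 'get %s\n' "$name"
        done
        printf 'quit\n'
    } > "$batch_file"
}
```

For example, write_sftp_batch fetch.batch results results.txt creates a file with the same three commands as the session above, which can then be run with sftp -b fetch.batch username@schooner.oscer.ou.edu. Batch mode cannot prompt for a password, so it is normally paired with non-interactive authentication such as SSH keys; whether that is set up for your account is an assumption to verify first.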
- Directory structure organization
  - Best practices for organizing your data on the cluster:
    - Keep project files in separate directories (e.g., /TL_OK_OriginData=_LR, /TL_OK_1Mil_LR)
    - Use consistent naming conventions for files and folders
    - Create a README file in each project directory describing contents
    - Organize input data, scripts, and output results in distinct subdirectories
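The layout described above could be set up with a small helper like the one below. This is only a sketch: make_project_skeleton is a hypothetical function name, and the subdirectory names input, scripts, and output are assumptions, not a cluster requirement.

```shell
# Create a project directory with separate subdirectories for
# input data, scripts, and output results, plus a starter README.
make_project_skeleton() {
    project_dir="$1"
    mkdir -p "$project_dir/input" "$project_dir/scripts" "$project_dir/output" || return 1

    readme="$project_dir/README"
    # Leave an existing README untouched so notes are never overwritten.
    if [ ! -e "$readme" ]; then
        {
            printf 'Project: %s\n' "$(basename "$project_dir")"
            printf 'Created: %s\n\n' "$(date +%Y-%m-%d)"
            printf 'input/    input data\n'
            printf 'scripts/  job and analysis scripts\n'
            printf 'output/   results produced by jobs\n'
        } > "$readme"
    fi
}
```

Calling make_project_skeleton ~/my_project (a placeholder path) creates the three subdirectories and a README that you can extend with a description of the contents.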
- Storage quotas and limitations
  Storage quotas on the cluster are designed to manage resources effectively:
  - Home directory quota: 20 GB per user
  - Scratch space: large shared temporary storage; files older than 30 days are automatically removed
  - Project space: check /ourdisk
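To keep an eye on these limits, something like the following sketch may help. It assumes GNU du and find; the helper names are hypothetical, the 20 GB figure is taken from the list above and treated as 20 x 1024 x 1024 KiB, and "older than 30 days" is interpreted as modification time, which may not match how the cluster's purge actually decides.

```shell
# Assumed home quota in KiB (20 GB read as 20 * 1024 * 1024 KiB).
HOME_QUOTA_KIB=$((20 * 1024 * 1024))

# Print the disk usage of a directory in KiB.
usage_kib() {
    du -sk "$1" | cut -f 1
}

# Print usage of a directory as a whole-number percentage of the assumed quota.
percent_of_quota() {
    [ -d "$1" ] || return 1
    used=$(usage_kib "$1")
    echo $(( used * 100 / HOME_QUOTA_KIB ))
}

# List regular files last modified more than N days ago (default 30).
list_old_files() {
    find "$1" -type f -mtime +"${2:-30}" -print
}
```

For instance, percent_of_quota "$HOME" gives a rough idea of how full the home directory is, and list_old_files run on your own scratch directory shows files that may be candidates for automatic removal, so they can be copied elsewhere first.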
- Backup strategies
  - Keep local copies of critical files on your personal computer
  - Consider using version control (e.g., Git) for code and documentation
  - Archive completed project data to prevent accidental loss
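Archiving a finished project could look like the sketch below, which bundles a directory into a compressed tar file and then reads the archive back as a basic sanity check. archive_project is a hypothetical helper name, and tar with gzip support (-z) is assumed to be available.

```shell
# Pack a project directory into a gzip-compressed tar archive.
# Arguments: project directory, archive file to create.
archive_project() {
    project_dir="${1%/}"    # drop a trailing slash, if any
    archive="$2"
    [ -d "$project_dir" ] || return 1

    # -C switches to the parent directory so the archive stores
    # paths relative to the project name, not the full path.
    tar -czf "$archive" -C "$(dirname "$project_dir")" "$(basename "$project_dir")" || return 1

    # Listing the archive fails if the file is truncated or unreadable.
    tar -tzf "$archive" > /dev/null
}
```

After a call such as archive_project ~/my_project ~/my_project.tar.gz (placeholder paths), the archive can be downloaded with scp and kept alongside your local copies.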