Amazon S3 Integration for Nodus
1. Introduction to the Amazon S3 Integration
What Is This Integration?
The Amazon S3 integration connects your S3 storage buckets with Nodus, allowing you to extract, analyze, and transform data from various file types stored in your buckets. This integration functions as a source connector, bringing your S3 files into the Nodus ecosystem for advanced analytics and business intelligence.
Prerequisites:
- An active AWS account with access to S3
- An S3 bucket containing data files in CSV, JSON, Excel, or Parquet format
- An AWS Access Key and Secret Key with read permissions on the S3 bucket
- Familiarity with the prefix and file structure used in your S3 bucket
Connection Overview:
The integration uses Amazon's S3 API to securely extract data from your storage buckets. Authentication is handled through AWS credentials, and the connector supports various file formats and configuration options to match your specific data storage patterns.
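For a concrete picture of what this connection involves, the sketch below shows roughly how a client authenticates to S3 with an Access Key pair and confirms it can reach a bucket. It uses Python with the boto3 library purely as an illustration; boto3 is not required to use the connector, and the bucket name and credentials shown are placeholders (the credentials are AWS's documented example values).

    # Minimal sketch (not part of Nodus): authenticate to S3 with an Access Key
    # pair and confirm the bucket is reachable. All values below are placeholders.
    import boto3

    s3 = boto3.client(
        "s3",
        aws_access_key_id="AKIAIOSFODNN7EXAMPLE",                          # Access Key
        aws_secret_access_key="wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",  # Secret Key
    )

    # Raises an error if the credentials are invalid or the bucket is not accessible.
    s3.head_bucket(Bucket="my-company-data")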
2. Platform Setup Documentation (Setup Form for Amazon S3)
Purpose & Scope
This section covers how to set up the initial connection between Nodus and Amazon S3 by providing the necessary authentication credentials and bucket details.
Field-by-Field Breakdown:
Integration Name
- Field Name & Label: Integration Name
- Description & Purpose: A descriptive name to identify this Amazon S3 integration within your Nodus account.
- Validation Rules & Format: Text string, required field.
- Examples: "Marketing Data Bucket", "Finance Reports S3"
- Troubleshooting Tips: Use a descriptive name that clearly identifies the specific S3 bucket or data purpose.
Bucket Name
- Field Name & Label: Bucket Name
- Description & Purpose: The name of your S3 bucket containing the files to extract.
- Validation Rules & Format: Text string following S3 bucket naming rules, required field.
- Examples: "my-company-data", "marketing-analytics-files"
- Troubleshooting Tips: Enter the bucket name exactly as it appears in your AWS S3 console, without any prefix or folder paths.
Access Key
- Field Name & Label: Access Key
- Description & Purpose: The AWS Access Key ID used to authenticate with S3.
- Validation Rules & Format: AWS Access Key format (typically 20 characters), required field.
- Examples: "AKIAIOSFODNN7EXAMPLE"
- Troubleshooting Tips: Use an access key associated with a user or role that has at least read-only permissions to the specified bucket.
Secret Key
- Field Name & Label: Secret Key
- Description & Purpose: The AWS Secret Access Key paired with the Access Key.
- Validation Rules & Format: AWS Secret Key format (typically 40 characters), required field.
- Examples: "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
- Troubleshooting Tips: Never share your secret key. Store it securely, and make sure the IAM user it belongs to has the permissions needed to read the bucket.
External Link
- Link Label: "S3 Documentation"
- URL: https://docs.aws.amazon.com/s3/
- Purpose: Provides access to official AWS S3 documentation for additional help.
Step-by-Step Guide:
1. Create an IAM user, or use an existing user with S3 access.
2. Create an IAM policy with the required permissions:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": [
            "s3:GetBucketLocation",
            "s3:GetObject",
            "s3:ListBucket"
          ],
          "Resource": [
            "arn:aws:s3:::{your-bucket-name}/*",
            "arn:aws:s3:::{your-bucket-name}"
          ]
        }
      ]
    }

3. Attach the policy to your IAM user.
4. Generate an Access Key and Secret Key from the IAM console.
5. In the Nodus setup form, enter your bucket name, Access Key, Secret Key, and a meaningful Integration Name.
6. Save the configuration.
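Before saving, you can optionally confirm that the new credentials exercise all three permissions the policy grants. The sketch below is a Python/boto3 check performed outside Nodus, with placeholder bucket and credential values.

    # Optional sketch: exercise the three permissions granted by the policy above.
    import boto3

    s3 = boto3.client(
        "s3",
        aws_access_key_id="YOUR_ACCESS_KEY",        # placeholder
        aws_secret_access_key="YOUR_SECRET_KEY",    # placeholder
    )
    bucket = "my-company-data"                      # placeholder

    print(s3.get_bucket_location(Bucket=bucket))            # needs s3:GetBucketLocation
    listing = s3.list_objects_v2(Bucket=bucket, MaxKeys=5)  # needs s3:ListBucket
    for obj in listing.get("Contents", []):
        s3.get_object(Bucket=bucket, Key=obj["Key"])        # needs s3:GetObject
        print("Readable:", obj["Key"])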
Reference Links:
- Amazon S3 Documentation: https://docs.aws.amazon.com/s3/
3. Extraction/Query Configuration (Extraction Form for Amazon S3)
Purpose & Overview
This section explains how to configure data extraction from Amazon S3. The integration allows you to specify file prefixes and formats to extract exactly the data you need.
Template & Field Documentation:
S3 Configuration Section
S3 Prefix
- Field Name & Label: S3 Prefix
- Description & Purpose: Specifies the folder path or prefix to filter S3 bucket contents
- Validation Rules & Format: Text string, required field
- Examples: "data/", "reports/2023/", "logs/application/"
- Troubleshooting Tips: The prefix is case-sensitive and should include a trailing slash for folder paths. To extract from the root of the bucket, leave this field blank; an empty prefix matches every object in the bucket.
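If you are unsure what a prefix will match, you can list the bucket with that prefix yourself before configuring the extraction. The sketch below is an illustrative Python/boto3 example with placeholder names; it mirrors the kind of prefix filtering the connector applies, not its actual implementation.

    # Sketch: preview which objects a prefix such as "reports/2023/" matches.
    import boto3

    s3 = boto3.client("s3")  # assumes AWS credentials are already configured locally
    paginator = s3.get_paginator("list_objects_v2")

    # An empty Prefix ("") matches every object in the bucket.
    for page in paginator.paginate(Bucket="my-company-data", Prefix="reports/2023/"):
        for obj in page.get("Contents", []):
            print(obj["Key"], obj["LastModified"])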
File Type
- Field Name & Label: File Type
- Description & Purpose: Defines the file format to extract data from
- Validation Rules & Format: Radio selection, required field
- Available Options:
- CSV - Comma-separated values files
- JSON - JSON formatted files
- Excel - Microsoft Excel files (.xlsx, .xls)
- Parquet - Apache Parquet columnar storage files
- Troubleshooting Tips: Select the format that matches your files. Files not matching the selected format will be ignored.
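To make the format choice concrete, the sketch below shows one way a downloaded object could be parsed for each supported file type, using pandas as a stand-in parser. This is an assumption for illustration only, not how Nodus necessarily parses files; Excel and Parquet parsing require the openpyxl and pyarrow packages respectively.

    # Sketch: parse a downloaded object according to the selected file type.
    import io
    import boto3
    import pandas as pd

    s3 = boto3.client("s3")
    key = "reports/2023/q1.csv"                                 # placeholder object key
    body = s3.get_object(Bucket="my-company-data", Key=key)["Body"].read()

    file_type = "CSV"                                           # CSV, JSON, Excel, or Parquet
    if file_type == "CSV":
        df = pd.read_csv(io.BytesIO(body))
    elif file_type == "JSON":
        df = pd.read_json(io.BytesIO(body))                     # adjust for JSON Lines files
    elif file_type == "Excel":
        df = pd.read_excel(io.BytesIO(body))                    # requires openpyxl
    elif file_type == "Parquet":
        df = pd.read_parquet(io.BytesIO(body))                  # requires pyarrow

    print(df.head())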
Extraction Date Selection
Historic Date
- Field Name & Label: Historic Date
- Description & Purpose: Specifies the date range for files to extract based on file modification dates
- Validation Rules & Format: Date picker, required field
- Troubleshooting Tips: Files whose modification date falls outside this range will not be extracted. S3 records object modification timestamps in UTC.
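Because the filter is based on each object's Last-Modified timestamp (reported by S3 in UTC), you can preview which files fall inside a given range with a check like the sketch below. It is an illustrative Python/boto3 example with placeholder values, not the connector's own logic.

    # Sketch: keep only objects whose Last-Modified timestamp falls in a UTC range.
    from datetime import datetime, timezone
    import boto3

    s3 = boto3.client("s3")
    start = datetime(2023, 1, 1, tzinfo=timezone.utc)
    end = datetime(2023, 3, 31, 23, 59, 59, tzinfo=timezone.utc)

    resp = s3.list_objects_v2(Bucket="my-company-data", Prefix="reports/")
    for obj in resp.get("Contents", []):
        if start <= obj["LastModified"] <= end:  # LastModified is a timezone-aware UTC datetime
            print("In range:", obj["Key"])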
Workflow & Examples:
- Enter the S3 prefix to target specific folders or file patterns
- Select the appropriate file type (CSV, JSON, Excel, or Parquet)
- Choose a historical date range based on file modification dates
- Preview the query to confirm configuration
- Execute extraction
Example Use Cases:
Daily Sales Reports:
- Prefix: "sales/daily/"
- File Type: CSV
- Date Range: Previous 90 days
Financial Data Analysis:
- Prefix: "finance/quarterly-reports/"
- File Type: Excel
- Date Range: Current year to date
Log Analysis:
- Prefix: "logs/application/"
- File Type: JSON
- Date Range: Previous 7 days
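As a worked illustration, the sketch below approximates the Log Analysis case above by combining all three settings: a prefix of "logs/application/", the JSON file type, and a seven-day modification window. It is a Python example using boto3 and pandas with placeholder names, and is only an approximation of what the connector does internally.

    # Sketch: "Log Analysis" -- JSON files under logs/application/ modified in the last 7 days.
    from datetime import datetime, timedelta, timezone
    import io
    import boto3
    import pandas as pd

    s3 = boto3.client("s3")
    cutoff = datetime.now(timezone.utc) - timedelta(days=7)

    pages = s3.get_paginator("list_objects_v2").paginate(
        Bucket="my-company-data", Prefix="logs/application/"
    )
    for page in pages:
        for obj in page.get("Contents", []):
            # Simplified type check; assumes standard JSON (adjust for JSON Lines).
            if obj["Key"].endswith(".json") and obj["LastModified"] >= cutoff:
                body = s3.get_object(Bucket="my-company-data", Key=obj["Key"])["Body"].read()
                df = pd.read_json(io.BytesIO(body))
                print(obj["Key"], len(df), "records")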
4. Troubleshooting & FAQs for Amazon S3
Common Issues & Error Messages
Authentication Failures
- Error: "Access denied" or "Invalid credentials"
- Solution: Verify your Access Key and Secret Key are correct and not expired. Check that the IAM user has the correct permissions for the bucket.
File Not Found
- Error: "No files matching the specified prefix and file type"
- Solution: Verify the prefix is correct and that files of the specified type exist in that location. Remember that S3 prefixes are case-sensitive.
Permission Issues
- Error: "Not authorized to perform s3:ListBucket" or "Not authorized to perform s3:GetObject"
- Solution: Ensure your IAM policy grants both ListBucket and GetObject permissions for the bucket and objects.
Data Format Issues
- Error: "Error parsing file" or "Invalid file format"
- Solution: Verify the selected file type matches the actual format of your files. Check for any corrupted files or non-standard formatting.
Date Range Filtering
- Issue: Files outside the specified date range are being extracted
- Solution: The date filter applies to the file modification date in S3, not any dates within the file content. Verify file modification dates match your expectations.
Contact & Support Information
- Nodus Support: support@nodus.com
S3 Best Practices
- Organize files with clear prefix structures for easier extraction
- Use consistent file formats within folders
- Consider using S3 lifecycle policies to manage file retention
- Use IAM roles instead of Access Keys when possible for improved security
- Consider enabling S3 versioning for critical data