
Amazon S3 Integration for Nodus

1. Introduction to the Amazon S3 Integration

What Is This Integration?

The Amazon S3 integration connects your S3 storage buckets with Nodus, allowing you to extract, analyze, and transform data from various file types stored in your buckets. This integration functions as a source connector, bringing your S3 files into the Nodus ecosystem for advanced analytics and business intelligence.

Prerequisites:

  • An active AWS account with access to S3
  • An S3 bucket containing data files in CSV, JSON, Excel, or Parquet format
  • An AWS Access Key and Secret Key with read permissions for the bucket
  • Knowledge of the file prefix and structure in your S3 bucket

Connection Overview:

The integration uses Amazon's S3 API to securely extract data from your storage buckets. Authentication is handled through AWS credentials, and the connector supports various file formats and configuration options to match your specific data storage patterns.
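
If you want to confirm up front that a given Access Key / Secret Key pair can reach your bucket, a quick check with the AWS SDK for Python (boto3), run independently of Nodus and using placeholder values, might look like the sketch below:

  import boto3
  from botocore.exceptions import ClientError

  # Placeholder credentials and bucket name; substitute your own values.
  s3 = boto3.client(
      "s3",
      aws_access_key_id="AKIA...",
      aws_secret_access_key="...",
  )

  try:
      # head_bucket succeeds only if the bucket exists and is reachable
      # with these credentials.
      s3.head_bucket(Bucket="my-company-data")
      print("Bucket is reachable with these credentials")
  except ClientError as err:
      print("Access check failed:", err.response["Error"]["Code"])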

2. Platform Setup Documentation (Setup Form for Amazon S3)

Purpose & Scope

This section covers how to set up the initial connection between Nodus and Amazon S3 by providing the necessary authentication credentials and bucket details.

Field-by-Field Breakdown:

Integration Name

  • Field Name & Label: Integration Name
  • Description & Purpose: A descriptive name to identify this Amazon S3 integration within your Nodus account.
  • Validation Rules & Format: Text string, required field.
  • Examples: "Marketing Data Bucket", "Finance Reports S3"
  • Troubleshooting Tips: Use a descriptive name that clearly identifies the specific S3 bucket or data purpose.

Bucket Name

  • Field Name & Label: Bucket Name
  • Description & Purpose: The name of your S3 bucket containing the files to extract.
  • Validation Rules & Format: Text string following S3 bucket naming rules, required field.
  • Examples: "my-company-data", "marketing-analytics-files"
  • Troubleshooting Tips: Enter the bucket name exactly as it appears in your AWS S3 console, without any prefix or folder paths.
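
As a rough illustration of those naming rules (a simplified sketch, not an exhaustive validator), a quick format check might look like this:

  import re

  def looks_like_valid_bucket_name(name: str) -> bool:
      # Core rules only: 3-63 characters, lowercase letters, digits, dots,
      # and hyphens, starting and ending with a letter or digit.
      return re.fullmatch(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]", name) is not None

  print(looks_like_valid_bucket_name("my-company-data"))  # True
  print(looks_like_valid_bucket_name("My_Bucket"))        # False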

Access Key

  • Field Name & Label: Access Key
  • Description & Purpose: The AWS Access Key ID used to authenticate with S3.
  • Validation Rules & Format: AWS Access Key format (typically 20 characters), required field.
  • Examples: "AKIAIOSFODNN7EXAMPLE"
  • Troubleshooting Tips: Use an access key associated with an IAM user that has at least read-only permissions for the specified bucket.

Secret Key

  • Field Name & Label: Secret Key
  • Description & Purpose: The AWS Secret Access Key paired with the Access Key.
  • Validation Rules & Format: AWS Secret Key format (typically 40 characters), required field.
  • Examples: "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
  • Troubleshooting Tips: Never share your secret key. AWS displays it only once, when the key is created, so store it securely and ensure the associated IAM user has read access to the bucket.

Step-by-Step Guide:

  1. Create an IAM user (or use an existing user) with S3 access

  2. Create an IAM policy with the required permissions:

    {    "Version": "2012-10-17",    "Statement": [        {        "Effect": "Allow",        "Action": [                "s3:GetBucketLocation",                "s3:GetObject",                "s3:ListBucket"        ],        "Resource": [                "arn:aws:s3:::{your-bucket-name}/*",                "arn:aws:s3:::{your-bucket-name}"        ]        }    ]}

  3. Attach the policy to your IAM user

  4. Generate Access Key and Secret Key from the IAM console

  5. Enter your bucket name, Access Key, Secret Key, and a meaningful Integration Name in the Nodus setup form

  6. Save the configuration
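
Before saving, you can optionally verify outside of Nodus that the new key pair covers the three actions granted in the policy above. A minimal boto3 sketch (bucket name and keys are placeholders) might look like this:

  import boto3

  s3 = boto3.client(
      "s3",
      aws_access_key_id="AKIA...",          # placeholder Access Key ID
      aws_secret_access_key="...",          # placeholder Secret Access Key
  )

  bucket = "my-company-data"                               # placeholder bucket name
  print(s3.get_bucket_location(Bucket=bucket))             # exercises s3:GetBucketLocation
  listing = s3.list_objects_v2(Bucket=bucket, MaxKeys=5)   # exercises s3:ListBucket
  for obj in listing.get("Contents", []):
      print(obj["Key"], obj["LastModified"])
  if listing.get("Contents"):
      key = listing["Contents"][0]["Key"]
      s3.get_object(Bucket=bucket, Key=key)["Body"].read(64)  # exercises s3:GetObject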

3. Extraction/Query Configuration (Extraction Form for Amazon S3)

Purpose & Overview

This section explains how to configure data extraction from Amazon S3. The integration allows you to specify file prefixes and formats to extract exactly the data you need.

Template & Field Documentation:

S3 Configuration Section

S3 Prefix

  • Field Name & Label: S3 Prefix
  • Description & Purpose: Specifies the folder path or prefix to filter S3 bucket contents
  • Validation Rules & Format: Text string, required field
  • Examples: "data/", "reports/2023/", "logs/application/"
  • Troubleshooting Tips: The prefix is case-sensitive and should include the trailing slash for folder paths. To extract from the root of the bucket, leave blank or use an empty string.
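
To confirm that a prefix targets the files you expect (including case and the trailing slash), you can list the matching keys before running the extraction. A small boto3 sketch, assuming credentials are already configured in your environment and using a placeholder bucket and prefix:

  import boto3

  s3 = boto3.client("s3")
  resp = s3.list_objects_v2(Bucket="my-company-data", Prefix="reports/2023/", MaxKeys=20)
  for obj in resp.get("Contents", []):
      print(obj["Key"])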

File Type

  • Field Name & Label: File Type
  • Description & Purpose: Defines the file format to extract data from
  • Validation Rules & Format: Radio selection, required field
  • Available Options:
    • CSV - Comma-separated values files
    • JSON - JSON formatted files
    • Excel - Microsoft Excel files (.xlsx, .xls)
    • Parquet - Apache Parquet columnar storage files
  • Troubleshooting Tips: Select the format that matches your files. Files not matching the selected format will be ignored.
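
If you are unsure whether your files really are in the format you plan to select, one low-effort check (independent of how the connector parses files internally) is to open a local copy with pandas. The file name below is a placeholder; Excel support requires openpyxl, and Parquet requires pyarrow or fastparquet:

  import pandas as pd

  # Map of the connector's file-type options to a typical reader for each format.
  readers = {
      "CSV": pd.read_csv,
      "JSON": pd.read_json,
      "Excel": pd.read_excel,
      "Parquet": pd.read_parquet,
  }

  # Placeholder file name; if this raises a parsing error, the file likely
  # does not match the selected format.
  df = readers["CSV"]("sales_2023-06-01.csv")
  print(df.head())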

Extraction Date Selection

Historic Date

  • Field Name & Label: Historic Date
  • Description & Purpose: Specifies the date range for files to extract based on file modification dates
  • Validation Rules & Format: Date picker, required field
  • Troubleshooting Tips: Files modified outside this date range will not be extracted. Note that S3 records object modification timestamps in UTC.
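
To preview which files a given date range would pick up, you can compare each object's LastModified timestamp (reported by S3 as a UTC datetime) against the range. A sketch with placeholder bucket, prefix, and dates:

  from datetime import datetime, timezone
  import boto3

  start = datetime(2024, 1, 1, tzinfo=timezone.utc)
  end = datetime(2024, 1, 8, tzinfo=timezone.utc)

  s3 = boto3.client("s3")
  resp = s3.list_objects_v2(Bucket="my-company-data", Prefix="logs/application/")
  for obj in resp.get("Contents", []):   # buckets with more than 1,000 keys need pagination
      if start <= obj["LastModified"] <= end:
          print(obj["Key"], obj["LastModified"])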

Workflow & Examples:

  1. Enter the S3 prefix to target specific folders or file patterns
  2. Select the appropriate file type (CSV, JSON, Excel, or Parquet)
  3. Choose a historical date range based on file modification dates
  4. Preview the query to confirm configuration
  5. Execute extraction

Example Use Cases:

Daily Sales Reports:

  • Prefix: "sales/daily/"
  • File Type: CSV
  • Date Range: Previous 90 days

Financial Data Analysis:

  • Prefix: "finance/quarterly-reports/"
  • File Type: Excel
  • Date Range: Current year to date

Log Analysis:

  • Prefix: "logs/application/"
  • File Type: JSON
  • Date Range: Previous 7 days

4. Troubleshooting & FAQs for Amazon S3

Common Issues & Error Messages

Authentication Failures

  • Error: "Access denied" or "Invalid credentials"
  • Solution: Verify your Access Key and Secret Key are correct and not expired. Check that the IAM user has the correct permissions for the bucket.
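
To pin down whether the problem is the credentials themselves or the IAM permissions, the error code on a failed request is the quickest signal. A boto3 sketch (placeholder bucket name, credentials from your environment) that separates the two cases:

  import boto3
  from botocore.exceptions import ClientError

  s3 = boto3.client("s3")
  try:
      s3.list_objects_v2(Bucket="my-company-data", MaxKeys=1)
  except ClientError as err:
      code = err.response["Error"]["Code"]
      if code in ("InvalidAccessKeyId", "SignatureDoesNotMatch"):
          print("Credential problem: check the Access Key and Secret Key values")
      elif code == "AccessDenied":
          print("Permission problem: check the IAM policy attached to the user")
      else:
          print("Unexpected error:", code)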

File Not Found

  • Error: "No files matching the specified prefix and file type"
  • Solution: Verify the prefix is correct and that files of the specified type exist in that location. Remember that S3 prefixes are case-sensitive.

Permission Issues

  • Error: "Not authorized to perform s3:ListBucket" or "Not authorized to perform s3:GetObject"
  • Solution: Ensure your IAM policy grants both ListBucket and GetObject permissions for the bucket and objects.

Data Format Issues

  • Error: "Error parsing file" or "Invalid file format"
  • Solution: Verify the selected file type matches the actual format of your files. Check for any corrupted files or non-standard formatting.

Date Range Filtering

  • Issue: Files outside the specified date range are being extracted
  • Solution: The date filter applies to the file modification date in S3, not any dates within the file content. Verify file modification dates match your expectations.

S3 Best Practices

  • Organize files with clear prefix structures for easier extraction
  • Use consistent file formats within folders
  • Consider using S3 lifecycle policies to manage file retention
  • Use IAM roles instead of Access Keys when possible for improved security
  • Consider enabling S3 versioning for critical data