Deploy a Production AI Platform on AWS for $100/month
From seven broken Lambda functions to a production AI platform in 8 articles.
That's the journey we've taken together. Functions that couldn't communicate, hit timeout walls, and left users staring at loading spinners. Now you get a complete platform that orchestrates complex workflows, streams real-time updates, and won't bankrupt your startup.
This isn't a toy example. The architecture I'm about to show you serves 1,500+ requests daily, has survived 8 months in production, and handles everything from document analysis to multi-step research tasks.
Time to deploy it.
Before we dive into deployment, here's what we're building:
API Gateway: receives requests, handles auth, enforces rate limits
Gateway Lambda: validates requests, checks budgets, routes to the appropriate service
ECS Agents: orchestrate multi-step workflows using Lambda tools
Lambda Tools: perform specific AI tasks (summarize, extract, classify)
DynamoDB: tracks usage, manages budgets, stores user data
WebSocket: streams real-time updates back to clients
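To make the Gateway Lambda's role concrete, here's a minimal sketch of its validate, check budget, route sequence. The type names, the `decide` function, the 402 status for an exhausted budget, and the route-to-target mapping are all assumptions for illustration; the real gateway in the repository may be organized differently.

```typescript
// Hypothetical shapes for illustration only.
type GatewayRequest = { apiKey?: string; path: string; estimatedCostUsd: number };
type Budget = { monthlyLimitUsd: number; spentUsd: number };
type Decision =
  | { ok: true; target: "ecs-agent" | "lambda-tool" }
  | { ok: false; status: 401 | 402 | 404; reason: string };

function decide(req: GatewayRequest, budgets: Map<string, Budget>): Decision {
  // 1. Validate: reject requests without a known API key.
  if (!req.apiKey) {
    return { ok: false, status: 401, reason: "missing API key" };
  }
  const budget = budgets.get(req.apiKey);
  if (!budget) {
    return { ok: false, status: 401, reason: "unknown API key" };
  }

  // 2. Check budget: refuse work that would exceed the monthly limit.
  if (budget.spentUsd + req.estimatedCostUsd > budget.monthlyLimitUsd) {
    return { ok: false, status: 402, reason: "monthly budget exceeded" };
  }

  // 3. Route: multi-step workflows go to ECS agents, single calls to Lambda tools.
  if (req.path.startsWith("/v1/agents/")) {
    return { ok: true, target: "ecs-agent" };
  }
  if (req.path === "/v1/complete" || req.path === "/v1/embed") {
    return { ok: true, target: "lambda-tool" };
  }
  return { ok: false, status: 404, reason: "unknown route" };
}
```

Doing the budget check before routing means an over-budget request never reaches a paid AI call.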
First, let's set up the deployment environment:
# Install AWS CDK
npm install -g aws-cdk
# Clone the platform
git clone https://github.com/tysoncung/ai-platform-aws.git
cd ai-platform-aws
# Install dependencies
npm install
npm run install:all # Installs in all packages
# Bootstrap CDK (one time per account/region)
npx cdk bootstrap
# Create environment file
cp .env.example .env
Edit .env with your configuration:
# AWS Configuration
AWS_REGION=us-east-1
AWS_ACCOUNT_ID=123456789012
# AI Provider API Keys
OPENAI_API_KEY=sk-your-openai-key
ANTHROPIC_API_KEY=sk-ant-your-anthropic-key
# Platform Configuration
PLATFORM_ENVIRONMENT=production
COST_TRACKING_ENABLED=true
BUDGET_ALERTS_ENABLED=true
# Monitoring
SLACK_WEBHOOK_URL=https://hooks.slack.com/your-webhook
ALERT_EMAIL=you@company.com
# Security
JWT_SECRET_KEY=your-super-secret-jwt-key
ENCRYPTION_SALT=your-encryption-salt
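A missing value in .env tends to surface as a confusing failure halfway through a deploy. Here's a small sketch of a startup check that fails fast instead. The `loadConfig` function, the `PlatformConfig` shape, and the choice of which keys count as required are my assumptions, not something the repository is known to ship.

```typescript
// Hypothetical config shape; only a few of the .env keys are surfaced here.
type PlatformConfig = {
  awsRegion: string;
  awsAccountId: string;
  costTrackingEnabled: boolean;
  budgetAlertsEnabled: boolean;
};

// Assumption: these keys must be present for the platform to start.
const REQUIRED_KEYS = [
  "AWS_REGION",
  "AWS_ACCOUNT_ID",
  "OPENAI_API_KEY",
  "ANTHROPIC_API_KEY",
  "JWT_SECRET_KEY",
  "ENCRYPTION_SALT",
];

function loadConfig(env: Record<string, string | undefined>): PlatformConfig {
  // Collect every missing key so the error lists them all at once.
  const missing = REQUIRED_KEYS.filter((key) => (env[key] ?? "").trim() === "");
  if (missing.length > 0) {
    throw new Error(`Missing required settings: ${missing.join(", ")}`);
  }

  // AWS account IDs are always 12 digits.
  const accountId = env["AWS_ACCOUNT_ID"] as string;
  if (!/^\d{12}$/.test(accountId)) {
    throw new Error("AWS_ACCOUNT_ID must be a 12-digit number");
  }

  return {
    awsRegion: env["AWS_REGION"] as string,
    awsAccountId: accountId,
    // Flags default to false unless explicitly set to "true".
    costTrackingEnabled: env["COST_TRACKING_ENABLED"] === "true",
    budgetAlertsEnabled: env["BUDGET_ALERTS_ENABLED"] === "true",
  };
}
```

In a Node process you would call `loadConfig(process.env)` once at startup.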
Before deploying to AWS, let's run everything locally with Docker Compose:
# docker-compose.yml
version: '3.8'
services:
  api-gateway:
    build:
      context: ./packages/gateway
      dockerfile: Dockerfile.dev
    ports:
      - "3000:3000"
    environment:
      - NODE_ENV=development
      - DYNAMODB_ENDPOINT=http://dynamodb:8000
      - AGENT_ENDPOINT=http://agent:3001
    depends_on:
      - dynamodb
      - agent
  agent:
    build:
      context: ./packages/agents
      dockerfile: Dockerfile.dev
    ports:
      - "3001:3001"
    environment:
      - NODE_ENV=development
      - LAMBDA_ENDPOINT=http://lambda-tools:3002
    depends_on:
      - lambda-tools
  lambda-tools:
    build:
      context: ./packages/tools
      dockerfile: Dockerfile.dev
    ports:
      - "3002:3002"
    environment:
      - NODE_ENV=development
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
  dynamodb:
    image: amazon/dynamodb-local:latest
    ports:
      - "8000:8000"
    command: ["-jar", "DynamoDBLocal.jar", "-sharedDb", "-inMemory"]
  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
Start the local environment:
# Start all services
docker-compose up -d
# Run database migrations
npm run db:migrate:local
# Seed with sample data
npm run db:seed:local
# Test the platform
curl http://localhost:3000/health
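One practical wrinkle: `docker-compose up -d` returns as soon as the containers start, which can be before the gateway is ready to answer the health check. Here's a sketch of a small retry helper for scripts that need to wait. The `waitForHealthy` name and its options are hypothetical; the check itself is injected so the helper doesn't assume any particular HTTP client.

```typescript
// Polls an injected check until it returns true or the attempts run out.
// Returns the number of attempts used.
async function waitForHealthy(
  check: () => Promise<boolean>,
  options: { attempts?: number; delayMs?: number } = {}
): Promise<number> {
  const attempts = options.attempts ?? 10;
  const delayMs = options.delayMs ?? 1000;

  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      if (await check()) {
        return attempt;
      }
    } catch {
      // Connection refused and similar errors just mean "not up yet".
    }
    if (attempt < attempts) {
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw new Error(`Service not healthy after ${attempts} attempts`);
}

// Usage sketch (Node 18+ provides a global fetch):
//   await waitForHealthy(async () =>
//     (await fetch("http://localhost:3000/health")).ok);
```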
The platform is composed of multiple CDK stacks for better separation of concerns:
// bin/deploy.ts
import * as cdk from 'aws-cdk-lib';
import { AIGatewayStack } from '../lib/gateway-stack';
import { AIAgentsStack } from '../lib/agents-stack';
import { AIToolsStack } from '../lib/tools-stack';
import { AIMonitoringStack } from '../lib/monitoring-stack';
import { AISecurityStack } from '../lib/security-stack';

const app = new cdk.App();
const env = {
  account: process.env.CDK_DEFAULT_ACCOUNT,
  region: process.env.CDK_DEFAULT_REGION
};

// Security layer (VPC, IAM, KMS)
const securityStack = new AISecurityStack(app, 'AISecurityStack', { env });

// Lambda tools layer
const toolsStack = new AIToolsStack(app, 'AIToolsStack', {
  env,
  vpc: securityStack.vpc,
  securityGroup: securityStack.lambdaSecurityGroup
});

// ECS agents layer
const agentsStack = new AIAgentsStack(app, 'AIAgentsStack', {
  env,
  vpc: securityStack.vpc,
  securityGroup: securityStack.ecsSecurityGroup,
  toolsArns: toolsStack.functionArns
});

// API Gateway layer
const gatewayStack = new AIGatewayStack(app, 'AIGatewayStack', {
  env,
  agentsCluster: agentsStack.cluster,
  agentsService: agentsStack.service,
  toolsArns: toolsStack.functionArns
});

// Monitoring and alerting
new AIMonitoringStack(app, 'AIMonitoringStack', {
  env,
  gatewayApi: gatewayStack.api,
  agentsService: agentsStack.service,
  toolsFunctions: toolsStack.functions
});
Here's the gateway stack implementation:
// lib/gateway-stack.ts
import * as cdk from 'aws-cdk-lib';
import { Construct } from 'constructs';
import * as apigateway from 'aws-cdk-lib/aws-apigateway';
import * as apigatewayv2 from 'aws-cdk-lib/aws-apigatewayv2';
import * as apigatewayv2integrations from 'aws-cdk-lib/aws-apigatewayv2-integrations';
import * as dynamodb from 'aws-cdk-lib/aws-dynamodb';
import * as ecs from 'aws-cdk-lib/aws-ecs';
import * as lambda from 'aws-cdk-lib/aws-lambda';

// Props shape inferred from how the stack is used below.
// The type of toolsArns is an assumption; adjust it to match AIToolsStack.
export interface AIGatewayStackProps extends cdk.StackProps {
  agentsCluster: ecs.ICluster;
  agentsService: ecs.IBaseService;
  toolsArns: Record<string, string>;
}

export class AIGatewayStack extends cdk.Stack {
  public readonly api: apigateway.RestApi;

  constructor(scope: Construct, id: string, props: AIGatewayStackProps) {
    super(scope, id, props);

    // DynamoDB tables
    const usageTable = new dynamodb.Table(this, 'UsageTable', {
      tableName: 'ai-platform-usage',
      partitionKey: { name: 'userId', type: dynamodb.AttributeType.STRING },
      sortKey: { name: 'timestamp', type: dynamodb.AttributeType.NUMBER },
      // On-demand capacity is called PAY_PER_REQUEST in the CDK
      billingMode: dynamodb.BillingMode.PAY_PER_REQUEST,
      timeToLiveAttribute: 'ttl'
    });
    const budgetTable = new dynamodb.Table(this, 'BudgetTable', {
      tableName: 'ai-platform-budgets',
      partitionKey: { name: 'userId', type: dynamodb.AttributeType.STRING },
      billingMode: dynamodb.BillingMode.PAY_PER_REQUEST
    });

    // Gateway Lambda function
    const gatewayFunction = new lambda.Function(this, 'GatewayFunction', {
      runtime: lambda.Runtime.NODEJS_18_X,
      code: lambda.Code.fromAsset('packages/gateway/dist'),
      handler: 'index.handler',
      timeout: cdk.Duration.seconds(30),
      memorySize: 512,
      environment: {
        USAGE_TABLE_NAME: usageTable.tableName,
        BUDGET_TABLE_NAME: budgetTable.tableName,
        AGENTS_CLUSTER_ARN: props.agentsCluster.clusterArn,
        AGENTS_SERVICE_ARN: props.agentsService.serviceArn,
        TOOLS_ARNS: JSON.stringify(props.toolsArns)
      }
    });

    // Grant permissions
    usageTable.grantReadWriteData(gatewayFunction);
    budgetTable.grantReadWriteData(gatewayFunction);

    // API Gateway
    this.api = new apigateway.RestApi(this, 'AIApi', {
      restApiName: 'AI Platform API',
      description: 'AI Platform REST API',
      defaultCorsPreflightOptions: {
        allowOrigins: apigateway.Cors.ALL_ORIGINS,
        allowMethods: apigateway.Cors.ALL_METHODS,
        allowHeaders: ['Content-Type', 'Authorization']
      }
    });

    // API Gateway integration
    const lambdaIntegration = new apigateway.LambdaIntegration(gatewayFunction);

    // Routes
    const v1 = this.api.root.addResource('v1');
    v1.addResource('complete').addMethod('POST', lambdaIntegration);
    v1.addResource('embed').addMethod('POST', lambdaIntegration);
    v1.addResource('stream').addMethod('POST', lambdaIntegration);
    const agents = v1.addResource('agents');
    agents.addResource('run').addMethod('POST', lambdaIntegration);
    agents.addResource('stream').addMethod('POST', lambdaIntegration);

    // Usage and budget endpoints
    const usage = v1.addResource('usage');
    usage.addMethod('GET', lambdaIntegration); // Get usage stats
    // Create the resource once; calling addResource('budget') twice throws
    const budget = usage.addResource('budget');
    budget.addMethod('GET', lambdaIntegration);
    budget.addMethod('PUT', lambdaIntegration);

    // WebSocket API for streaming
    const webSocketApi = new apigatewayv2.WebSocketApi(this, 'StreamingAPI', {
      apiName: 'AI Platform Streaming',
      connectRouteOptions: {
        integration: new apigatewayv2integrations.WebSocketLambdaIntegration(
          'ConnectIntegration',
          gatewayFunction
        )
      },
      disconnectRouteOptions: {
        integration: new apigatewayv2integrations.WebSocketLambdaIntegration(
          'DisconnectIntegration',
          gatewayFunction
        )
      },
      defaultRouteOptions: {
        integration: new apigatewayv2integrations.WebSocketLambdaIntegration(
          'DefaultIntegration',
          gatewayFunction
        )
      }
    });
    new apigatewayv2.WebSocketStage(this, 'StreamingStage', {
      webSocketApi,
      stageName: 'prod',
      autoDeploy: true
    });
  }
}
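The usage table above sets `timeToLiveAttribute: 'ttl'`, and DynamoDB only honors TTL values stored as a Number holding Unix epoch seconds. Here's a sketch of how a usage record could be built to match the table's keys. The `buildUsageRecord` helper, the extra fields (`model`, `tokens`, `costUsd`), the millisecond `timestamp`, and the 90-day retention are assumptions for illustration.

```typescript
// Hypothetical record shape matching the UsageTable keys (userId + timestamp).
type UsageRecord = {
  userId: string;    // partition key
  timestamp: number; // sort key; epoch milliseconds is an assumption here
  model: string;
  tokens: number;
  costUsd: number;
  ttl: number;       // epoch SECONDS, as DynamoDB TTL requires
};

const SECONDS_PER_DAY = 24 * 60 * 60;

function buildUsageRecord(
  userId: string,
  model: string,
  tokens: number,
  costUsd: number,
  now: Date = new Date(),
  retentionDays: number = 90
): UsageRecord {
  const nowMs = now.getTime();
  return {
    userId,
    timestamp: nowMs,
    model,
    tokens,
    costUsd,
    // Convert to seconds before adding the retention window.
    ttl: Math.floor(nowMs / 1000) + retentionDays * SECONDS_PER_DAY,
  };
}
```

Writing milliseconds into `ttl` is an easy mistake: DynamoDB would read it as a date tens of thousands of years away and never expire the item.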
Now let's deploy everything:
# 1. Validate CDK configuration
npx cdk doctor
# 2. Review what will be deployed
npx cdk diff
# 3. Deploy security stack first
npx cdk deploy AISecurityStack
# 4. Deploy Lambda tools
npx cdk deploy AIToolsStack
# 5. Deploy ECS agents
npx cdk deploy AIAgentsStack
# 6. Deploy API Gateway
npx cdk deploy AIGatewayStack
# 7. Deploy monitoring
npx cdk deploy AIMonitoringStack
# Or deploy everything at once
npx cdk deploy --all
The deployment takes about 15 minutes. You'll see output like:
AIGatewayStack.APIEndpoint = https://abc123.execute-api.us-east-1.amazonaws.com/v1
AIGatewayStack.WebSocketEndpoint = wss://def456.execute-api.us-east-1.amazonaws.com/prod
AIAgentsStack.ClusterName = ai-platform-agents
AIToolsStack.SummarizeFunctionArn = arn:aws:lambda:us-east-1:123456789012:function:summarize
Once deployed, configure your AI provider credentials:
# Store API keys in AWS Systems Manager
aws ssm put-parameter \
--name "/ai-platform/openai-api-key" \
--value "sk-your-openai-key" \
--type "SecureString"
aws ssm put-parameter \
--name "/ai-platform/anthropic-api-key" \
--value "sk-ant-your-anthropic-key" \
--type "SecureString"
# Update the deployed functions with the new parameter names
npx cdk deploy AIToolsStack AIGatewayStack
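Functions then have to read those parameters at runtime, and fetching on every invocation adds latency and SSM API calls. Here's a sketch of a small in-memory cache you could wrap around the lookup. The `createSecretCache` helper and its 5-minute default are my own assumptions; the fetcher is injected, so the sketch doesn't depend on any particular SDK.

```typescript
type SecretFetcher = (name: string) => Promise<string>;

// Returns a getSecret function that caches values for ttlMs milliseconds.
function createSecretCache(
  fetchSecret: SecretFetcher,
  ttlMs: number = 5 * 60_000,
  now: () => number = Date.now
): (name: string) => Promise<string> {
  const cache = new Map<string, { value: string; expiresAt: number }>();

  return async function getSecret(name: string): Promise<string> {
    const hit = cache.get(name);
    if (hit && hit.expiresAt > now()) {
      return hit.value; // still fresh, skip the network call
    }
    const value = await fetchSecret(name);
    cache.set(name, { value, expiresAt: now() + ttlMs });
    return value;
  };
}

// Usage sketch with AWS SDK v3 (@aws-sdk/client-ssm), not executed here:
//   const ssm = new SSMClient({});
//   const getSecret = createSecretCache(async (name) => {
//     const out = await ssm.send(
//       new GetParameterCommand({ Name: name, WithDecryption: true }));
//     return out.Parameter?.Value ?? "";
//   });
//   const openAiKey = await getSecret("/ai-platform/openai-api-key");
```

Create the cache at module scope so it survives across invocations within the same Lambda execution environment.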
Let's test the complete platform:
# 1. Health check
curl https://your-api-endpoint.execute-api.us-east-1.amazonaws.com/v1/health
# 2. Create an API key
curl -X POST https://your-api-endpoint/v1/auth/keys \
-H "Content-Type: application/json" \
-d '{
"name": "Test Key",
"scopes": ["ai:complete", "ai:embed", "agent:run"],
"monthlyBudget": 50
}'
# Returns: {"apiKey": "sk-proj-abc123...", "keyId": "sk-proj-abc"}
# 3. Test completion
curl -X POST https://your-api-endpoint/v1/complete \
-H "Authorization: Bearer sk-proj-abc123..." \
-H "Content-Type: application/json" \
-d '{
"messages": [{"role": "user", "content": "Write a haiku about TypeScript"}],
"model": "gpt-4",
"temperature": 0.8
}'
# 4. Test agent workflow
curl -X POST https://your-api-endpoint/v1/agents/run \
-H "Authorization: Bearer sk-proj-abc123..." \
-H "Content-Type: application/json" \
-d '{
"type": "research",
"input": {"topic": "renewable energy trends"},
"tools": ["search", "summarize", "extract"]
}'
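If you'd rather call the API from code than from curl, here's a sketch of the completion call from step 3 as a typed request builder. The `buildCompletionRequest` helper and its types are hypothetical and are not the platform's SDK; they only mirror the headers and body used in the curl command.

```typescript
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };
type CompletionParams = {
  messages: ChatMessage[];
  model: string;
  temperature?: number;
};

// Builds the URL and fetch options for POST /v1/complete.
function buildCompletionRequest(
  baseUrl: string,
  apiKey: string,
  params: CompletionParams
) {
  return {
    // Strip trailing slashes so the path is not doubled up.
    url: `${baseUrl.replace(/\/+$/, "")}/v1/complete`,
    init: {
      method: "POST",
      headers: {
        "Authorization": `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(params),
    },
  };
}

// Usage sketch (Node 18+ provides a global fetch):
//   const { url, init } = buildCompletionRequest(
//     "https://your-api-endpoint", "sk-proj-abc123...", {
//       messages: [{ role: "user", content: "Write a haiku about TypeScript" }],
//       model: "gpt-4",
//       temperature: 0.8,
//     });
//   const response = await fetch(url, init);
```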
The platform includes a built-in dashboard at /dashboard. Here's what you'll see:
Usage Overview:
Requests per day/hour
Token consumption by model
Cost breakdown by user
Success/error rates
Real-time Monitoring:
Active agent sessions
Queue depth for tools
Response time percentiles
Error alerts
Budget Management:
Per-user spend tracking
Budget utilization alerts
Cost projections
BYOK vs platform credit usage
System Health:
Lambda cold start metrics
ECS task utilization
DynamoDB performance
API Gateway latency
You can access it at: https://your-api-endpoint/dashboard
Here are the real metrics from 8 months running in production:
Latency (P95):
Simple completion: 1.2s
Streaming completion: 180ms to first token
Agent workflow (3 tools): 12s
API Gateway overhead: 45ms
Lambda cold start: 850ms (mitigated with provisioned concurrency)
Throughput:
Sustained: 50 requests/second
Burst: 200 requests/second (before rate limiting)
Agent concurrency: 15 parallel workflows
Tool execution: 100 parallel Lambda invocations
Reliability:
Uptime: 99.8%
Error rate: 0.4%
P99 latency SLA: 5s (met 98.9% of the time)
Budget enforcement accuracy: 99.99%
Cost Optimization Wins:
Response caching: 25% reduction in API calls
Smart model selection: 40% cost reduction (Claude Haiku for summaries)
BYOK adoption: 70% of users, eliminating platform AI costs
Lambda right-sizing: 30% reduction in compute costs
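If you want to compute the same kind of numbers for your own deployment, here's a sketch of the two calculations behind the latency figures: a percentile (P95, P99) and the share of requests that met an SLA threshold. It uses the nearest-rank percentile definition, which is one common choice; I'm not claiming it's the exact method CloudWatch or the dashboard uses.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) {
    throw new Error("percentile needs at least one sample");
  }
  if (p <= 0 || p > 100) {
    throw new RangeError("p must be in the range (0, 100]");
  }
  const sorted = [...samples].sort((a, b) => a - b);
  // Multiply before dividing to keep the rank exact for whole-number inputs.
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1];
}

// Fraction of requests at or under the SLA threshold, between 0 and 1.
function slaAttainment(samples: number[], thresholdMs: number): number {
  if (samples.length === 0) {
    throw new Error("slaAttainment needs at least one sample");
  }
  const met = samples.filter((ms) => ms <= thresholdMs).length;
  return met / samples.length;
}
```

A single slow outlier can dominate P95 on a small sample, so treat percentiles from a handful of requests with suspicion.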
Fixed Infrastructure (Monthly):
API Gateway: $3.50 (1M requests)
Lambda (Gateway): $8.20 (compute + requests)
ECS Fargate: $15.40 (2 tasks avg)
DynamoDB: $6.80 (usage + budgets)
Application Load Balancer: $16.20
NAT Gateway: $45.00 (data transfer)
CloudWatch: $4.30 (logs + metrics)
Route 53: $0.50 (hosted zone)
----
Total Fixed: $99.90/month
Variable Costs:
AI API costs: Pass-through with 2% platform markup
Data transfer: $0.09/GB out of AWS
Lambda executions: $0.20 per million requests
DynamoDB reads/writes: $0.25 per million operations
Real customer costs (excluding AI API):
Light usage (500 req/month): $12/month
Medium usage (5K req/month): $35/month
Heavy usage (50K req/month): $120/month
The platform is cost-effective for most use cases. The break-even point vs building your own infrastructure is around 2,000 requests per month.
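To sanity-check the fixed total and play with the variable rates, here's a small sketch that encodes the line items and unit prices listed above. The function names are hypothetical, and the variable-cost formula covers only the three AWS rates quoted (data transfer, Lambda requests, DynamoDB operations), not AI API pass-through.

```typescript
// Fixed monthly line items, stored in cents to avoid floating-point drift.
const FIXED_COSTS_CENTS: Record<string, number> = {
  "API Gateway": 350,
  "Lambda (Gateway)": 820,
  "ECS Fargate": 1540,
  "DynamoDB": 680,
  "Application Load Balancer": 1620,
  "NAT Gateway": 4500,
  "CloudWatch": 430,
  "Route 53": 50,
};

function fixedMonthlyUsd(): number {
  const cents = Object.values(FIXED_COSTS_CENTS).reduce((sum, c) => sum + c, 0);
  return cents / 100;
}

// Variable AWS costs at the quoted rates:
// $0.09 per GB out, $0.20 per million Lambda requests,
// $0.25 per million DynamoDB operations.
function variableMonthlyUsd(
  gbOut: number,
  lambdaRequests: number,
  dynamoOperations: number
): number {
  return (
    gbOut * 0.09 +
    (lambdaRequests / 1_000_000) * 0.2 +
    (dynamoOperations / 1_000_000) * 0.25
  );
}
```

Summing the eight line items gives $99.90, which matches the total above. The NAT Gateway alone is close to half of it, so that's the first place to look if you need to cut the fixed bill.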
Lambda cold starts were killing our performance. Here's how we solved it:
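// Provisioned concurrency for critical functions.
// Needs: import * as events from 'aws-cdk-lib/aws-events';
//        import * as targets from 'aws-cdk-lib/aws-events-targets';
// Reserved concurrency is set on the function itself. Provisioned concurrency
// is set on a version or alias, and only applies to traffic that invokes that
// alias, so point the API Gateway integration at the alias as well.
const gatewayFunction = new lambda.Function(this, 'GatewayFunction', {
  // ... other config
  reservedConcurrentExecutions: 10
});
// The construct ID and alias name here are illustrative
const liveAlias = new lambda.Alias(this, 'GatewayLiveAlias', {
  aliasName: 'live',
  version: gatewayFunction.currentVersion,
  provisionedConcurrentExecutions: 5
});

// Keep-warm rule that pings the Lambda every 5 minutes
new events.Rule(this, 'KeepWarmRule', {
  schedule: events.Schedule.rate(cdk.Duration.minutes(5)),
  targets: [
    new targets.LambdaFunction(gatewayFunction, {
      event: events.RuleTargetInput.fromObject({ warmup: true })
    })
  ]
});

// In Lambda handler - respond quickly to warmup
export const handler = async (event: any) => {
  if (event.warmup) {
    return { statusCode: 200, body: 'warm' };
  }
  // Normal processing...
};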
Result: Cold start rate dropped from 23% to 3% of requests.
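If you want to track a cold start rate like this yourself, one common approach is a flag that's true only for the first invocation in each Lambda execution environment. Here's a sketch under that assumption. The `createHandler` factory and the `recordColdStart` callback are hypothetical names, and I'm not claiming this is how the figures above were measured.

```typescript
type WarmableEvent = { warmup?: boolean };

// The returned handler remembers whether it has run before. Create it once at
// module scope so the flag lives as long as the execution environment does.
function createHandler(recordColdStart: (cold: boolean) => void) {
  let isColdStart = true;

  return async (event: WarmableEvent) => {
    const cold = isColdStart;
    isColdStart = false;

    // Warmup pings absorb the cold start, so they are not counted as requests.
    if (event.warmup) {
      return { statusCode: 200, body: "warm" };
    }

    recordColdStart(cold);
    // Normal processing would go here.
    return { statusCode: 200, body: "ok" };
  };
}

// In a Lambda module you would export the result once, for example:
//   export const handler = createHandler((cold) =>
//     console.log(JSON.stringify({ metric: "coldStart", value: cold ? 1 : 0 })));
```

The cold start rate is then the number of cold requests divided by total requests over the period you care about.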
This platform is completely open source. Here's what's coming next:
Q2 2026:
[ ] Multi-region deployment support
[ ] GraphQL API alongside REST
[ ] Built-in vector database (Pinecone integration)
[ ] Advanced agent memory management
Q3 2026:
[ ] Kubernetes support (alternative to ECS)
[ ] Multi-tenant isolation improvements
[ ] Advanced cost optimization (spot instances)
[ ] Plugin system for custom tools
Q4 2026:
[ ] Edge deployment (CloudFlare Workers)
[ ] Real-time collaboration features
[ ] Advanced monitoring and observability
[ ] Enterprise SSO integration
Community Requests:
Google Cloud and Azure support
Terraform modules (alternative to CDK)
Python SDK alongside TypeScript
Zapier/Make.com integrations
The entire platform is open source under MIT license. Everything I've built, you can use, modify, and improve.
Repositories:
Main platform: github.com/tysoncung/ai-platform-aws
Working examples: github.com/tysoncung/ai-platform-aws-examples
How to help:
Star the repositories - helps others discover the project
Try the full deployment - example 07-full-stack has everything
Report deployment issues - especially AWS region differences
Submit improvements - see CONTRIBUTING.md for guidelines
Share your experience - what are you building with it?
Connect:
Email: tyson@hivo.co
Twitter: @tysoncung
What We Built Together
Eight articles. One complete AI platform.
We started with seven broken Lambda functions. We built:
Agent orchestration that handles complex multi-step workflows without timeouts
TypeScript SDK with perfect IntelliSense, streaming support, and smart error handling
Cost control that prevents $2,847 surprises with budgets and rate limits
Production security with authentication, encryption, and monitoring
One-command deployment that gets you running in under an hour
The platform serves 1,500+ requests daily. It's survived 8 months in production. It's processing everything from document analysis to research workflows. And it's completely open source.
Building production AI infrastructure taught me things tutorials never mention:
Technical truths:
Cost control is life support, not a nice-to-have feature
Lambda excels at tools, fails at orchestration
Streaming looks simple, implementation is brutal
Type safety prevents expensive mistakes at 3AM
Business realities:
Developers pay for great experience, abandon bad APIs
Open source builds trust better than marketing
Production numbers matter more than perfect demos
Failure stories teach more than success posts
Personal discoveries:
Building in public creates accountability
Documentation is your product's face
Shipping beats perfecting every time
Sharing mistakes helps everyone improve
You have everything you need. Real code, real examples, real production lessons. The platform is MIT licensed - use it, improve it, make money with it.
Next steps:
Star the repos - ai-platform-aws and examples
Deploy example 07 - full platform in under an hour
Build something cool - then tell me about it
Share your experience - help others learn from your journey
Get stuck? Email me at tyson@hivo.co or find me on Twitter @tysoncung.
The AI revolution needs better infrastructure. You can build it.
Go.
End of series: "Building an AI Platform on AWS from Scratch". Complete platform and examples at github.com/tysoncung/ai-platform-aws.