Protect Amazon Connect from viruses and malware by scanning attachments

Michael Wittig – 21 Nov 2023

Four years ago, we stumbled into Amazon Connect. In essence, Amazon Connect allows your users to reach your organization’s agents via phone or chat. While chatting, users and agents can upload attachments. For many years, there was no good solution to ensure those files were malware-free. Given that anonymous users can start Amazon Connect chats, that’s quite scary.

Luckily, Amazon Connect just released a feature that enables scanning of attachments for malware. You might think: “Great, Amazon Connect scans all files from now on.” But no, Amazon Connect enables you to scan the attachments yourself. In other words, Amazon Connect invokes a Lambda function that you run to check whether a file is clean or infected. Only if your Lambda function marks the file as clean does it become visible to the other party. In this blog post, I share what I learned while integrating our product bucketAV - Antivirus protection for Amazon S3 with Amazon Connect to scan files for malware.

Amazon Connect attachment scanning

The following figure shows an interaction between a user and an agent. The user (on the left) uploads a clean file first, followed by a virus. The agent (on the right) only receives the clean file. The infected file gets blocked.

Amazon Connect user uploading attachments in a chat

In the following, I share my code with you.

bucketAV customers can skip the rest of the blog post and use our Amazon Connect integration to scan attachments as documented here.

The contract

Amazon Connect calls the Lambda function with the following payload in a synchronous invocation:

{
  "Version": "1.0",
  "InstanceId": "ca6f678d-42f6-4b25-b6ad-2e74db01117a",
  "File": {
    "FileId": "76d46e1d-6720-4705-98be-fbc2ba034638",
    "FileCreationTime": 1700126816287,
    "FileName": "cloudonaut.png",
    "FileSizeInBytes": 2877697,
    "FileLocation": {
      "S3Location": {
        "Key": "connect/bucketav/Attachments/chat/2023/11/16/ecd01b13-6c44-4432-a8da-0c7a9149b79a_76d46e1d-6720-4705-98be-fbc2ba034638_20231116T09:26_UTC.png",
        "Bucket": "amazon-connect-0af5e3328eab",
        "Arn": "arn:aws:s3:::amazon-connect-0af5e3328eab/connect/bucketav/Attachments/chat/2023/11/16/ecd01b13-6c44-4432-a8da-0c7a9149b79a_76d46e1d-6720-4705-98be-fbc2ba034638_20231116T09:26_UTC.png"
      }
    }
  }
}

The function is expected to return the following for clean files within 60 seconds:

{
  "Status": "APPROVED"
}

For infected files:


{
  "Status": "REJECTED"
}

Amazon Connect retries up to 3 times when a synchronous invocation returns an error. The maximum file size for an attachment to a case or a chat is 20 MB.
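To make the contract concrete, here is a minimal sketch of a handler skeleton that reads the S3 location from the payload and answers with the expected status. It is an illustration only: `createHandler`, `extractS3Location`, `toConnectResponse`, and the injected `scanObject` function are hypothetical names, and the verdict strings `clean` and `infected` are assumptions of this sketch. Throwing on an unknown verdict surfaces an error to Amazon Connect instead of silently approving a file.

```javascript
// Maps a scan verdict to the response shape Amazon Connect expects.
// Assumption: the scanner reports 'clean' or 'infected'.
function toConnectResponse(verdict) {
  if (verdict === 'clean') {
    return { Status: 'APPROVED' };
  }
  if (verdict === 'infected') {
    return { Status: 'REJECTED' };
  }
  // Anything else is treated as an error rather than approved by accident.
  throw new Error(`unexpected verdict: ${verdict}`);
}

// Pulls the fields a scanner needs out of the Amazon Connect payload.
function extractS3Location(event) {
  const { Bucket, Key } = event.File.FileLocation.S3Location;
  return { bucket: Bucket, key: Key, fileId: event.File.FileId };
}

// Hypothetical factory: scanObject is a placeholder for your antivirus
// integration and must resolve to a verdict string.
function createHandler(scanObject) {
  return async function handler(event) {
    const location = extractS3Location(event);
    const verdict = await scanObject(location);
    return toConnectResponse(verdict);
  };
}
```

In a real Lambda function, you would export the function returned by `createHandler` as the handler and make sure it responds within the 60-second limit.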

The implementation

bucketAV scans files uploaded to Amazon S3 asynchronously. A scan job is submitted to an SQS queue, and bucketAV publishes the result to an SNS topic. Unfortunately, Amazon Connect follows a synchronous model. Therefore, I fall back to good old busy waiting to integrate bucketAV’s asynchronous model with Amazon Connect’s synchronous approach.

The first Lambda function is connected to the SNS topic where bucketAV publishes scan results. When a scan result is published, the Lambda function stores it in a DynamoDB table. The environment variable TABLE_NAME is used to pass the name of the DynamoDB table that stores scan results.

// connect-subscription.js
import { DynamoDB } from '@aws-sdk/client-dynamodb';

// Scan results are only needed while the second function polls for them
const TRACE_TTL_IN_SECONDS = 120;

const dynamodb = new DynamoDB({apiVersion: '2012-08-10'});

export async function handler(event) {
  console.log(`Invoke: ${JSON.stringify(event)}`);
  const now = Date.now();
  // Store one item per SNS record, keyed by the trace ID of the scan job
  await Promise.all(event.Records.map(record => dynamodb.putItem({
    TableName: process.env.TABLE_NAME,
    Item: {
      id: {S: record.Sns.MessageAttributes.trace_id.Value},
      status: {S: record.Sns.MessageAttributes.status.Value},
      // Expiry time in epoch seconds
      ttl: {N: (now/1000+TRACE_TTL_IN_SECONDS).toFixed(0)}
    }
  })));
  return true;
}
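The mapping from an SNS record to a DynamoDB item in connect-subscription.js can be isolated as a pure function, which makes the TTL arithmetic easy to check without AWS. The following is a minimal sketch under that assumption; `toItem` is a hypothetical helper name, and the record shape mirrors the fields the handler above reads.

```javascript
const TRACE_TTL_IN_SECONDS = 120;

// Builds the DynamoDB item for one SNS record.
// nowInMilliseconds is passed in so the result is deterministic in tests.
function toItem(record, nowInMilliseconds) {
  const attributes = record.Sns.MessageAttributes;
  return {
    id: { S: attributes.trace_id.Value },
    status: { S: attributes.status.Value },
    // DynamoDB TTL expects epoch seconds stored as a Number attribute.
    ttl: { N: (nowInMilliseconds / 1000 + TRACE_TTL_IN_SECONDS).toFixed(0) }
  };
}

// Example record with made-up attribute values
const exampleRecord = {
  Sns: {
    MessageAttributes: {
      trace_id: { Type: 'String', Value: 'bucketav:connect:example-stack:example-file-id' },
      status: { Type: 'String', Value: 'clean' }
    }
  }
};

const exampleItem = toItem(exampleRecord, 1700126816287);
console.log(exampleItem.ttl.N); // prints "1700126936"
```

Keep in mind that DynamoDB removes expired items in the background, not at the exact expiry time, so the TTL is a cleanup mechanism and not a guarantee that stale results disappear after 120 seconds.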

The second Lambda function is the one invoked by Amazon Connect. When an attachment is uploaded, the Lambda function:

  • Submits a scan job to the SQS queue.
  • Queries the DynamoDB table every second until a scan result is available (remember, the first Lambda function stores the scan result).
  • Returns APPROVED for clean files or REJECTED for infected files as Amazon Connect expects.

The environment variable TABLE_NAME is the same as for the first Lambda function. SCAN_QUEUE_URL references the URL of the SQS queue that enqueues scan jobs. I also use STACK_NAME to reference the CloudFormation stack name that deploys the Lambda function.

// connect.js
import { SQS } from '@aws-sdk/client-sqs';
import { DynamoDB } from '@aws-sdk/client-dynamodb';

// Stop polling after 55 seconds to stay below the 60-second limit
const POLL_TIMEOUT_IN_MILLISECONDS = 1000*55;

const sqs = new SQS({apiVersion: '2012-11-05'});
const dynamodb = new DynamoDB({apiVersion: '2012-08-10'});

async function wait(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

export async function handler(event) {
  console.log(`Invoke: ${JSON.stringify(event)}`);
  const start = Date.now();
  // The trace ID links the scan job to the scan result stored in DynamoDB
  const traceId = `bucketav:connect:${process.env.STACK_NAME}:${event.File.FileId}`;
  const object = {
    bucket: event.File.FileLocation.S3Location.Bucket,
    key: event.File.FileLocation.S3Location.Key,
    trace_id: traceId
  };
  // Submit the scan job
  await sqs.sendMessage({
    MessageBody: JSON.stringify({
      objects: [object]
    }),
    QueueUrl: process.env.SCAN_QUEUE_URL
  });
  // Busy waiting: check once per second whether the scan result has arrived
  while((Date.now()-start) < POLL_TIMEOUT_IN_MILLISECONDS) {
    const {Item: item} = await dynamodb.getItem({
      TableName: process.env.TABLE_NAME,
      Key: {
        id: {S: traceId}
      }
    });
    if (item === undefined) {
      await wait(1000);
      continue;
    } else if (item.status.S === 'clean') {
      return {
        Status: 'APPROVED'
      };
    } else if (item.status.S === 'infected' || item.status.S === 'no') {
      // Both 'infected' and 'no' lead to a rejection
      return {
        Status: 'REJECTED'
      };
    } else {
      throw new Error(`unexpected status: ${item.status.S}`);
    }
  }
  throw new Error('timeout');
}
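The polling loop in connect.js can also be factored into a small generic helper, so the timeout behavior can be tested with a fake lookup instead of DynamoDB. This is a sketch, not part of the deployed code: `pollUntil`, its `lookup` callback, and the option names `timeoutMs` and `intervalMs` are hypothetical.

```javascript
function wait(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Calls lookup() repeatedly until it resolves to something other than
// undefined, or rejects with a 'timeout' error once timeoutMs has elapsed.
async function pollUntil(lookup, { timeoutMs, intervalMs }) {
  const start = Date.now();
  while ((Date.now() - start) < timeoutMs) {
    const result = await lookup();
    if (result !== undefined) {
      return result;
    }
    await wait(intervalMs);
  }
  throw new Error('timeout');
}

// Fake lookup for illustration: the "scan result" shows up on the third call.
function createFakeLookup() {
  let calls = 0;
  return async () => {
    calls += 1;
    return calls < 3 ? undefined : 'clean';
  };
}
```

With the values from the post, the call would look like `pollUntil(lookup, { timeoutMs: 55000, intervalMs: 1000 })`, where `lookup` wraps the `getItem` request and resolves to the stored status or `undefined`.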

The following AWS infrastructure is required:

  • DynamoDB Table.
  • IAM roles for both Lambda functions.
  • Lambda functions.
  • SNS subscription for the first Lambda function.
  • Permission for the first Lambda function to allow SNS to invoke the function.

The following code is used to deploy the application with the AWS CDK:

const { CfnParameter, CfnCondition, CfnOutput, Fn, Aws } = require('aws-cdk-lib');
const dynamodb = require('aws-cdk-lib/aws-dynamodb');
const cloudwatch = require('aws-cdk-lib/aws-cloudwatch');
const sns = require('aws-cdk-lib/aws-sns');
const lambda = require('aws-cdk-lib/aws-lambda');
const iam = require('aws-cdk-lib/aws-iam');
const logs = require('aws-cdk-lib/aws-logs');
const esbuild = require('esbuild');

// Helper function to format Lambda function source code in a way CloudFormation can deploy (inline code)
function zipFile(lambdaFile, target) {
  return esbuild.buildSync({
    entryPoints: [lambdaFile],
    external: ['@aws-sdk/*'],
    target: [target],
    platform: 'node',
    bundle: true,
    write: false
  }).outputFiles[0].text;
}

const bucketAVStackName = new CfnParameter(this, 'BucketAVStackName', {
  description: 'CloudFormation stack name of bucketAV (if you followed our docs, the name is bucketav)',
  type: 'String'
});

const table = new dynamodb.CfnTable(this, 'Table', {
  attributeDefinitions: [{
    attributeName: 'id',
    attributeType: 'S'
  }],
  billingMode: 'PAY_PER_REQUEST',
  keySchema: [{
    attributeName: 'id',
    keyType: 'HASH'
  }],
  sseSpecification: {
    sseEnabled: true
  },
  timeToLiveSpecification: {
    attributeName: 'ttl',
    enabled: true
  }
});

const subscriptionLambdaRole = new iam.CfnRole(this, 'SubscriptionLambdaRole', {
  assumeRolePolicyDocument: {
    Version: '2012-10-17',
    Statement: [{
      Effect: 'Allow',
      Principal: {
        Service: 'lambda.amazonaws.com'
      },
      Action: 'sts:AssumeRole'
    }]
  },
  policies: [{
    policyName: 'lambda',
    policyDocument: {
      Statement: [{
        Effect: 'Allow',
        Action: 'dynamodb:PutItem',
        Resource: table.attrArn
      }]
    }
  }]
});

const subscriptionLambdaFunction = new lambda.CfnFunction(this, 'SubscriptionLambdaFunction', {
  code: {
    zipFile: zipFile('connect-subscription.js', 'node18')
  },
  environment: {
    variables: {
      TABLE_NAME: table.ref
    }
  },
  handler: 'index.handler',
  memorySize: 1769,
  role: subscriptionLambdaRole.attrArn,
  runtime: 'nodejs18.x',
  timeout: 60
});

const subscriptionLambdaPermission = new lambda.CfnPermission(this, 'SubscriptionLambdaPermission', {
  action: 'lambda:InvokeFunction',
  functionName: subscriptionLambdaFunction.ref,
  principal: 'sns.amazonaws.com',
  sourceArn: Fn.importValue(`${bucketAVStackName.valueAsString}-FindingsTopicArn`)
});

const subscriptionLambdaLogGroup = new logs.CfnLogGroup(this, 'SubscriptionLambdaLogGroup', {
  logGroupName: `/aws/lambda/${subscriptionLambdaFunction.ref}`,
  retentionInDays: 14
});

const subscriptionLambdaPolicy = new iam.CfnPolicy(this, 'SubscriptionLambdaPolicy', {
  roles: [
    subscriptionLambdaRole.ref
  ],
  policyName: 'logs',
  policyDocument: {
    Statement: [{
      Effect: 'Allow',
      Action: [
        'logs:CreateLogStream',
        'logs:PutLogEvents'
      ],
      Resource: subscriptionLambdaLogGroup.attrArn
    }]
  }
});

const subscription = new sns.CfnSubscription(this, 'Subscription', {
  endpoint: subscriptionLambdaFunction.attrArn,
  filterPolicy: {
    trace_id: [{prefix: `bucketav:connect:${Aws.STACK_NAME}:`}]
  },
  protocol: 'lambda',
  topicArn: Fn.importValue(`${bucketAVStackName.valueAsString}-FindingsTopicArn`)
});
subscription.addDependency(subscriptionLambdaPermission);
subscription.addDependency(subscriptionLambdaPolicy);

const connectLambdaRole = new iam.CfnRole(this, 'ConnectLambdaRole', {
  assumeRolePolicyDocument: {
    Version: '2012-10-17',
    Statement: [{
      Effect: 'Allow',
      Principal: {
        Service: 'lambda.amazonaws.com'
      },
      Action: 'sts:AssumeRole'
    }]
  },
  policies: [{
    policyName: 'lambda',
    policyDocument: {
      Statement: [{
        Effect: 'Allow',
        Action: 'sqs:SendMessage',
        Resource: Fn.importValue(`${bucketAVStackName.valueAsString}-ScanQueueArn`)
      }, {
        Effect: 'Allow',
        Action: 'dynamodb:GetItem',
        Resource: table.attrArn
      }]
    }
  }]
});

const connectLambdaFunction = new lambda.CfnFunction(this, 'ConnectLambdaFunction', {
  code: {
    zipFile: zipFile('connect.js', 'node18')
  },
  environment: {
    variables: {
      TABLE_NAME: table.ref,
      STACK_NAME: Aws.STACK_NAME,
      SCAN_QUEUE_URL: Fn.importValue(`${bucketAVStackName.valueAsString}-ScanQueueUrl`)
    }
  },
  handler: 'index.handler',
  memorySize: 1769,
  role: connectLambdaRole.attrArn,
  runtime: 'nodejs18.x',
  timeout: 60 // Maximum timeout for an attachment scanner: 60 seconds
});

const connectLambdaLogGroup = new logs.CfnLogGroup(this, 'ConnectLambdaLogGroup', {
  logGroupName: `/aws/lambda/${connectLambdaFunction.ref}`,
  retentionInDays: 14
});

new iam.CfnPolicy(this, 'ConnectLambdaPolicy', {
  roles: [
    connectLambdaRole.ref
  ],
  policyName: 'logs',
  policyDocument: {
    Statement: [{
      Effect: 'Allow',
      Action: [
        'logs:CreateLogStream',
        'logs:PutLogEvents'
      ],
      Resource: connectLambdaLogGroup.attrArn
    }]
  }
});

// Amazon Connect adds permissions automatically when enabling attachment scanning
// new lambda.CfnPermission(this, 'ConnectLambdaPermission', {
//   action: 'lambda:InvokeFunction',
//   functionName: connectLambdaFunction.ref,
//   principal: 'connect.amazonaws.com',
//   sourceArn: `arn:aws:connect:${Aws.REGION}:${Aws.ACCOUNT_ID}:instance/INSTANCE_ID`
// });
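
One detail in the infrastructure code deserves a closer look: the SNS subscription filters on the prefix `bucketav:connect:${Aws.STACK_NAME}:`, which is exactly how connect.js starts every trace ID. The sketch below imitates that prefix match locally to show why results from other stacks or other bucketAV integrations never reach the first Lambda function. The helper names are hypothetical, and the real filtering happens inside SNS, not in your code.

```javascript
// Prefix shared by the SNS filter policy and the trace IDs built in connect.js
function traceIdPrefix(stackName) {
  return `bucketav:connect:${stackName}:`;
}

// Same format as the traceId in connect.js
function buildTraceId(stackName, fileId) {
  return traceIdPrefix(stackName) + fileId;
}

// Local imitation of the {prefix: ...} match on the trace_id attribute
function matchesSubscription(traceId, stackName) {
  return traceId.startsWith(traceIdPrefix(stackName));
}

const traceId = buildTraceId('connect-a', '76d46e1d-6720-4705-98be-fbc2ba034638');
console.log(matchesSubscription(traceId, 'connect-a'));  // prints true
console.log(matchesSubscription(traceId, 'connect'));    // prints false
```

The trailing colon in the prefix matters: without it, a stack named `connect` would also receive the results meant for a stack named `connect-a`.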

That’s it. I hope my blog post helps you to integrate your antivirus solution with Amazon Connect. If you don’t have an existing antivirus solution for AWS, check out bucketAV and the new Amazon Connect attachment scanning integration.

Michael Wittig

I’ve been building on AWS since 2012 together with my brother Andreas. We are sharing our insights into all things AWS on cloudonaut and have written the book AWS in Action. Besides that, we’re currently working on bucketAV, HyperEnv for GitHub Actions, and marbot.
