AWS WAFv2 to analyze docx files against XSS attacks

I have a web application that takes a .docx file as input and analyzes some sections of it. The file size is usually around 3 MB.

I have set up AWS WAFv2 using Terraform to prevent XSS attacks that could be carried out by inserting a payload inside the docx file. This is my first time dealing with WAFv2, so there are a lot of concepts that are still a bit unclear to me. I assume that the docx file is treated as the body of the request.

Below is the code of a rule within a rule group with an xss_match_statement.

rule {
  name     = "xss-prevent-attack-on-body"
  priority = 1

  action {
    allow {}
  }

  statement {
    xss_match_statement {
      field_to_match {
        body {}
      }

      dynamic "text_transformation" {
        for_each = var.text_transformations
        iterator = transformation
        content {
          priority = index(var.text_transformations, transformation.value)
          type     = transformation.value
        }
      }
    }
  }

  visibility_config {
    cloudwatch_metrics_enabled = true
    metric_name                = "xss-prevent-attack-rule"
    sampled_requests_enabled   = true
  }
}

For the text transformations, I currently pass these values:

text_transformations = ["HTML_ENTITY_DECODE", "URL_DECODE"]
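
For completeness, here is a minimal sketch of how that input could be declared. The description text and the choice to set a default are assumptions for illustration, not necessarily how it is declared in my module. With this list, index() returns 0 for HTML_ENTITY_DECODE and 1 for URL_DECODE, so the transformations are applied in list order.

```hcl
# Hypothetical declaration of the variable consumed by the dynamic block.
variable "text_transformations" {
  description = "Ordered list of WAFv2 text transformations to apply before XSS matching"
  type        = list(string)

  # Same values as above; the list position becomes the transformation priority.
  default = ["HTML_ENTITY_DECODE", "URL_DECODE"]
}
```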

The rule group is attached to a web ACL, which is in turn associated with an Application Load Balancer. I read in the AWS documentation that only the first 8 KB (8,192 bytes) of the request body are forwarded to AWS WAF for inspection.
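
The wiring described above can be sketched roughly as follows. This is a simplified illustration, not my exact configuration: the resource names (`main`, `xss`, `app`), the ACL name, and the metric names are placeholders, and `aws_wafv2_rule_group.xss` and `aws_lb.app` are assumed to be defined elsewhere.

```hcl
# Regional web ACL that references the rule group containing the XSS rule.
resource "aws_wafv2_web_acl" "main" {
  name  = "docx-upload-acl" # placeholder name
  scope = "REGIONAL"        # required scope for an Application Load Balancer

  default_action {
    allow {}
  }

  rule {
    name     = "xss-rule-group"
    priority = 0

    # Keep the actions defined by the rules inside the rule group.
    override_action {
      none {}
    }

    statement {
      rule_group_reference_statement {
        arn = aws_wafv2_rule_group.xss.arn # assumed to exist elsewhere
      }
    }

    visibility_config {
      cloudwatch_metrics_enabled = true
      metric_name                = "xss-rule-group"
      sampled_requests_enabled   = true
    }
  }

  visibility_config {
    cloudwatch_metrics_enabled = true
    metric_name                = "docx-upload-acl"
    sampled_requests_enabled   = true
  }
}

# Associates the web ACL with the load balancer.
resource "aws_wafv2_web_acl_association" "alb" {
  resource_arn = aws_lb.app.arn # assumed to exist elsewhere
  web_acl_arn  = aws_wafv2_web_acl.main.arn
}
```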

I suspect something is missing here. From my preliminary analysis, I would say that the WAF cannot analyze the file as it is, and that it would need to be converted to text first, but I’m just guessing.

My questions are:

  • Is the file considered the body of the request?
  • Is what I have implemented so far logically correct?
  • Is there a way to make the whole file analyzed by WAFv2?

Thank you all in advance for your efforts.


