If you've ever built automations against Atlassian's REST APIs, you know that navigating their documentation can sometimes feel like a boss battle. Recently, I encountered a critical issue where our automated Start of Day (SOD) and End of Day (EOD) reporting scripts suddenly stopped working.

If you have a Python script, CI/CD pipeline, or automation bot that relies on exporting Confluence pages as PDFs, and it suddenly started returning authentication or security errors, this guide is for you.

The Problem

Our automation was simple: hit the Confluence API, grab the page, export it as a PDF using the pdfpageexport.action endpoint, and distribute it to the team.

Out of nowhere, the script started failing. After digging through the logs and testing the endpoints, the issue became clear: the export URL now strictly required a parameter called atl_token.

If you look through the Atlassian documentation, you will see references to adding the X-Atlassian-Token: no-check header to bypass XSRF (Cross-Site Request Forgery) checks. For standard REST API calls, this works perfectly. But PDF exports in Confluence rely on older Struts actions (WebWork actions). These endpoints behave like traditional browser actions and validate the atl_token before the logic ever looks at your bypass header.

The Trap

When you hit a wall like this, the developer instinct is to open the browser's Network tab, manually click "Export to PDF", grab the atl_token you see in the URL, and paste it into your script.

I admit, I brute-forced my way into this temporary fix. But there is a massive catch: The atl_token is a session-bound XSRF token.

It is cryptographically linked to your specific active browser session (JSESSIONID). If you hardcode this token:

  1. It breaks when you log out: The moment your session expires, the token dies, and your script fails again.
  2. It's restricted to you: If you share the script with the Finance or HR team so they can run it with their Service Accounts, the server will reject it because the token doesn't match their session credentials.

The Solution

To make the automation robust, scalable, and team-agnostic, we need to stop treating the atl_token as a static credential and start treating it as a dynamic session variable.

The fix involves refactoring your script to mimic how a browser behaves:

  1. Establish a persistent HTTP session.
  2. Pre-fetch the target Confluence page.
  3. Scrape the HTML for the dynamically generated atl_token.
  4. Inject that fresh token into the PDF export URL.
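
As a small illustration of the last step, here is a minimal sketch of building the export URL from a freshly scraped token. The helper name build_export_url and the example domain are hypothetical; the path is the pdfpageexport.action endpoint used in this post. Using urlencode is my own choice here, so that any special characters in the token are escaped instead of being pasted in raw.

```python
from urllib.parse import urlencode


def build_export_url(base_url: str, page_id: str, atl_token: str) -> str:
    """Return the PDF export URL with the token passed as a query parameter.

    build_export_url is a hypothetical helper, not part of any Atlassian SDK.
    """
    # urlencode escapes characters such as "+" or spaces in the token.
    query = urlencode({"pageId": page_id, "atl_token": atl_token})
    return f"{base_url}/wiki/spaces/flyingpdf/pdfpageexport.action?{query}"


# Example with made-up values:
url = build_export_url("https://example.atlassian.net", "12345", "a b+c")
print(url)
```

The token is an argument, never a constant, so every run has to supply one that belongs to the current session.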

The Implementation

Here is how you can implement this robust fix using Python and the requests library.

Step 1: Set up a persistent session

Instead of making isolated requests, use requests.Session(). This ensures that the authentication cookies provided when you connect are maintained throughout the script's execution.

Step 2: Write a token-scraping method

Confluence embeds this token directly into the HTML of the page, usually within a <meta> tag or a hidden form input. We can fetch the page and use a regular expression to extract it.
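
Regular expressions are sensitive to attribute order and spacing, so if the markup shifts slightly the match can silently fail. As an alternative, here is a rough sketch that does the same extraction with Python's built-in html.parser instead of regex. The class and function names are hypothetical, and the sample HTML is made up for illustration; only the ajs-atl-token meta name and the atl_token input name come from this post.

```python
from html.parser import HTMLParser
from typing import Optional


class AtlTokenParser(HTMLParser):
    """Remembers the first atl_token found in a <meta> or <input> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.token: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if self.token is not None:
            return  # keep the first token we saw
        attr = dict(attrs)  # attribute order no longer matters
        if tag == "meta" and attr.get("name") == "ajs-atl-token":
            self.token = attr.get("content")
        elif tag == "input" and attr.get("name") == "atl_token":
            self.token = attr.get("value")


def extract_atl_token(html: str) -> Optional[str]:
    """Return the token from the page HTML, or None if it is absent."""
    parser = AtlTokenParser()
    parser.feed(html)
    return parser.token


# Made-up sample: note that content comes before name here.
SAMPLE_HTML = '<html><head><meta content="abc123" name="ajs-atl-token"></head></html>'
print(extract_atl_token(SAMPLE_HTML))  # prints abc123
```

In a real script you would pass response.text from the page fetch to extract_atl_token and treat a None result as a sign that authentication probably failed.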

import base64
import re

import requests


class ConfluenceConnector:
    def __init__(self, domain: str, username: str, api_token: str):
        self.api_domain = f"https://{domain}"

        # Build the HTTP Basic credentials from the username and API token.
        auth_string = f"{username}:{api_token}"
        auth_token = base64.b64encode(auth_string.encode()).decode()

        # One session for every request, so cookies set by the server persist.
        self.session = requests.Session()
        self.session.headers.update({
            "Authorization": f"Basic {auth_token}",
            "X-Atlassian-Token": "no-check",
        })

    def _fetch_atl_token(self, page_id: str) -> str:
        """Navigates to the page and scrapes the session-specific XSRF token."""
        url = f"{self.api_domain}/wiki/pages/viewpage.action?pageId={page_id}"
        response = self.session.get(url)
        response.raise_for_status()

        # First choice: the <meta name="ajs-atl-token"> tag. The pattern stops at
        # the closing quote so that both ">" and "/>" tag endings match.
        meta_match = re.search(r'<meta\s+name="ajs-atl-token"\s+content="([^"]+)"', response.text, re.IGNORECASE)
        if meta_match:
            return meta_match.group(1)

        # Fallback: a hidden form input named atl_token.
        input_match = re.search(r'<input[^>]+name="atl_token"[^>]+value="([^"]+)"', response.text, re.IGNORECASE)
        if input_match:
            return input_match.group(1)

        raise RuntimeError("Could not find atl_token in HTML. Check authentication.")

    def request_pdf_export(self, page_id: str) -> requests.Response:
        """Triggers the PDF export using the dynamically generated token."""
        # Scrape a fresh token on every call; never reuse one across sessions.
        atl_token = self._fetch_atl_token(page_id)
        pdf_download_url = (
            f"{self.api_domain}/wiki/spaces/flyingpdf/pdfpageexport.action"
            f"?pageId={page_id}&atl_token={atl_token}"
        )
        # Redirects are not followed, and raise_for_status() only raises on
        # 4xx/5xx, so the caller must handle a 3xx response itself.
        response = self.session.get(pdf_download_url, allow_redirects=False)
        response.raise_for_status()

        return response
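
Because the export request is sent with allow_redirects=False, a successful call does not guarantee that the response body is the PDF itself. The following is a hedged sketch of how a caller might inspect the result before saving it. The function classify_export_response is hypothetical, and the assumption that the endpoint may answer with a redirect or an HTML page is mine; the check itself relies only on the fact that PDF files begin with the bytes %PDF-.

```python
from typing import Mapping


def classify_export_response(status_code: int, headers: Mapping[str, str], body: bytes) -> str:
    """Return 'pdf', 'redirect', or 'unexpected' for an export response.

    Hypothetical helper. With a plain dict the "Location" lookup is
    case-sensitive; requests' own response.headers is case-insensitive.
    """
    # A 3xx with a Location header means the file lives somewhere else.
    if 300 <= status_code < 400 and "Location" in headers:
        return "redirect"
    # Every PDF file starts with the magic bytes "%PDF-".
    if status_code == 200 and body.startswith(b"%PDF-"):
        return "pdf"
    # Anything else (for example an HTML login page) should not be saved.
    return "unexpected"


# Examples with made-up responses:
print(classify_export_response(200, {}, b"%PDF-1.7 ..."))
print(classify_export_response(302, {"Location": "/download/x.pdf"}, b""))
print(classify_export_response(200, {}, b"<html>Log in</html>"))
```

With the connector above, this could be called as classify_export_response(response.status_code, response.headers, response.content), and only the 'pdf' case written to disk.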

Wrapping Up

APIs change, and legacy endpoints acting as web-actions are particularly notorious for silent security updates. By shifting from a static token approach to a dynamic, session-aware HTML scraping approach, your automation can survive these XSRF protection updates.

Now, whether the script is run by your developer account, a Finance team member, or a CI/CD service account bot, the atl_token will always be fresh, valid, and successfully authorized.

References & Useful Links

If you want to dive deeper into how Atlassian handles XSRF tokens and session management, these resources are incredibly helpful:

  • Community Discussion: Can I get the "atl_token"? — The original thread discussing the challenges of extracting this token programmatically.
  • Official Developer Docs: Form Token Handling / XSRF Protection — Atlassian's official documentation on how XSRF protection works within Confluence plugins and webwork actions.
  • REST API Guidelines: Confluence Server REST API — Broad documentation covering standard REST authentication and when to use the X-Atlassian-Token: no-check header.