Skip to content

fix: add thread safety to server status management#1095

Open
Sypher845 wants to merge 2 commits intokrkn-chaos:mainfrom
Sypher845:add-thread-safety-server-status
Open

fix: add thread safety to server status management#1095
Sypher845 wants to merge 2 commits intokrkn-chaos:mainfrom
Sypher845:add-thread-safety-server-status

Conversation

@Sypher845
Copy link
Copy Markdown

@Sypher845 Sypher845 commented Jan 17, 2026

User description

Type of change

  • Refactor
  • New feature
  • Bug fix
  • Optimization

Description

Added thread safety to the HTTP server in server.py by introducing a lock mechanism. The server runs in a separate thread and handles requests that read/write a global status variable. Without proper synchronization, this could lead to race conditions if multiple requests happen at the same time.

The fix is simple: added a threading.Lock() and wrapped all access to the server_status variable with it. This ensures only one thread can read or write the status at a time, preventing any potential conflicts.

Related Tickets & Documents

Documentation

  • Is documentation needed for this update?

Checklist before requesting a review

  • I have performed a self-review of my code by running krkn and specific scenario
  • If it is a core feature, I have added thorough unit tests with above 80% coverage

PR Type

Bug fix, Enhancement


Description

  • Replaced deprecated _thread module with modern threading module

  • Added thread-safe locking mechanism for server_status variable access

  • Refactored thread creation to use threading.Thread with daemon mode

  • Added comprehensive test suite for concurrent access and thread safety


Tests after the modification

There were 24 tests before , 5 tests were added to verify thread safety improvements.

  1. test_concurrent_writes_to_server_status
  2. test_concurrent_reads_and_writes
  3. test_concurrent_handler_status_updates
  4. test_lock_prevents_partial_writes
  5. test_status_lock_is_reentrant_safe
image image

Also when we run kraken

image image

Diagram Walkthrough

flowchart LR
  A["Deprecated _thread module"] -->|Replace| B["Modern threading module"]
  C["Unprotected server_status"] -->|Add Lock| D["Thread-safe status access"]
  E["Thread creation"] -->|Modernize| F["threading.Thread with daemon"]
  G["Basic tests"] -->|Extend| H["Comprehensive thread safety tests"]
Loading

File Walkthrough

Relevant files
Bug fix
server.py
Modernize threading and add status lock protection             

server.py

  • Replaced import _thread with import threading for modern thread
    handling
  • Added status_lock = threading.Lock() to protect server_status variable
  • Wrapped all server_status reads and writes with lock context managers
  • Refactored start_server() to use threading.Thread with daemon mode
    instead of _thread.start_new_thread()
+18/-10 
Tests
test_server.py
Update tests for threading modernization and add thread safety tests

tests/test_server.py

  • Updated imports to include threading and time modules
  • Updated mocks from @patch('server._thread') to
    @patch('server.threading.Thread')
  • Refactored thread-related test assertions to match new
    threading.Thread API
  • Added new TestThreadSafety test class with 5 comprehensive concurrent
    access tests
  • Tests cover concurrent writes, concurrent reads/writes, handler
    updates, partial write prevention, and lock safety
+203/-8 

Copilot AI review requested due to automatic review settings January 17, 2026 11:31
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds thread safety to the HTTP server's status management by introducing a threading.Lock to protect access to the global server_status variable. The server runs in a separate thread and handles concurrent HTTP requests that read and write the status, making synchronization necessary to prevent race conditions.

Changes:

  • Introduced threading module and created a status_lock lock object
  • Wrapped all reads and writes to server_status with the lock using context managers
  • Protected status access in do_status(), set_run(), set_stop(), set_pause(), and publish_kraken_status()

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Copy Markdown
Collaborator

@tsebastiani tsebastiani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @Sypher845 you have pending discussions with copilot, what say?

@Sypher845
Copy link
Copy Markdown
Author

Hey @tsebastiani , I have already addressed Copilot's suggestions in the latest commit, i forgot to mark resolve them in conversation.

@Sypher845 Sypher845 force-pushed the add-thread-safety-server-status branch from aff619d to 11304da Compare January 26, 2026 14:10
@Sypher845
Copy link
Copy Markdown
Author

Also solved the merge conflicts , please review.

@Sypher845 Sypher845 requested a review from tsebastiani January 26, 2026 14:11
@tsebastiani
Copy link
Copy Markdown
Collaborator

tsebastiani commented Jan 28, 2026

Could you kindly provide a screenshot showing server.py running successfully after your modifications, as well as—if available—a screenshot that demonstrates the effectiveness of those changes?

@qodo-code-review
Copy link
Copy Markdown

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
🟢
No security concerns identified No security vulnerabilities detected by AI analysis. Human verification advised for critical code.
Ticket Compliance
🟡
🎫 #1068
🟢 Add a `threading.Lock()` to protect the global `server_status` shared between threads.
Wrap all server_status reads and writes with the lock (preferably via context managers) to
avoid race conditions.
Update `publish_kraken_status()` to use the same locking approach.
Run existing tests to ensure no regressions.
Codebase Duplication Compliance
Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance
🟢
Generic: Comprehensive Audit Trails

Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Meaningful Naming and Self-Documenting Code

Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Secure Error Handling

Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Secure Logging Practices

Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

🔴
Generic: Robust Error Handling and Edge Case Management

Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation

Status:
Missing exception context: The new try/except around thread startup logs a generic failure message but omits the
caught exception details, reducing debuggability and actionable context.

Referred Code
try:
    server_thread = threading.Thread(target=httpd.serve_forever, daemon=True)
    server_thread.start()
    publish_kraken_status(status)
except Exception as e:
    logging.error(
        "Failed to start the http server \
                  at http://%s:%s"

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Security-First Input Validation and Data Handling

Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities

Status:
Unauthenticated state change: The /RUN, /STOP, and /PAUSE POST endpoints mutate global server state without any visible
authentication/authorization or input validation, which may be acceptable in a trusted
environment but cannot be verified from the diff.

Referred Code
def do_POST(self):
    if self.path == "/STOP":
        self.set_stop()
    elif self.path == "/RUN":
        self.set_run()
    elif self.path == "/PAUSE":
        self.set_pause()

def set_run(self):
    global server_status
    self.send_response(200)
    self.end_headers()
    with status_lock:
        server_status = 'RUN'

def set_stop(self):
    global server_status
    self.send_response(200)
    self.end_headers()
    with status_lock:
        server_status = 'STOP'


 ... (clipped 7 lines)

Learn more about managing compliance generic rules or creating your own custom rules

Compliance status legend 🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label

@qodo-code-review
Copy link
Copy Markdown

PR Code Suggestions ✨

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
Possible issue
Guard counter increment with lock

Prevent a race condition by protecting the increment of the shared
requests_served counter with a lock.

server.py [27]

-SimpleHTTPRequestHandler.requests_served += 1
+with status_lock:
+    SimpleHTTPRequestHandler.requests_served += 1
  • Apply / Chat
Suggestion importance[1-10]: 7

__

Why: This suggestion correctly identifies and fixes a race condition on the requests_served counter, which was overlooked in the PR's thread-safety improvements, thus enhancing the code's correctness.

Medium
General
Correctly test for lock re-entrancy

Correct the test_status_lock_is_reentrant_safe test to accurately reflect its
purpose, as the current implementation does not test for lock re-entrancy.

tests/test_server.py [545-559]

-def test_status_lock_is_reentrant_safe(self):
+def test_status_lock_is_not_reentrant(self):
     """
-    Test that status_lock can be safely used in nested contexts
+    Test that status_lock is not re-entrant and will block on a second acquisition.
     """
-    server.server_status = "INITIAL"
+    lock_acquired = threading.Event()
+    test_finished = threading.Event()
 
-    # Acquire lock and modify status
-    with server.status_lock:
-        server.server_status = "LOCKED"
-        # Read it back while still holding the lock
-        status = server.server_status
-        self.assertEqual(status, "LOCKED")
+    def acquire_lock_twice():
+        with server.status_lock:
+            lock_acquired.set()
+            # This second acquisition will block, as Lock is not re-entrant
+            with server.status_lock:
+                pass
+        test_finished.set()
 
-    # Verify status persists after lock release
-    self.assertEqual(server.server_status, "LOCKED")
+    thread = threading.Thread(target=acquire_lock_twice)
+    thread.start()
 
+    # The thread should acquire the lock once and then block.
+    # We expect the test to time out here, proving it's not re-entrant.
+    acquired = lock_acquired.wait(timeout=0.1)
+    self.assertTrue(acquired, "Lock was not acquired in the first place.")
+
+    # The thread should be blocked on the second acquire, so test_finished is not set
+    self.assertFalse(test_finished.is_set(), "Lock is re-entrant, which is not expected.")
+
+    # Clean up the thread
+    # Note: In a real scenario, the thread would be deadlocked.
+    # This test structure is for demonstration; the thread will hang.
+    # For a production test suite, consider more advanced cleanup.
+    thread.join(timeout=0.1)
+    self.assertTrue(thread.is_alive(), "Thread finished, indicating re-entrancy or an issue.")
+
  • Apply / Chat
Suggestion importance[1-10]: 5

__

Why: The suggestion correctly identifies that the test test_status_lock_is_reentrant_safe is misnamed and its implementation does not test for re-entrancy, improving the accuracy and quality of the test suite.

Low
Keep server thread reference

Store the server thread reference in a global variable to allow for clean
shutdown or joining later.

server.py [70-71]

+global server_thread
 server_thread = threading.Thread(target=httpd.serve_forever, daemon=True)
 server_thread.start()
  • Apply / Chat
Suggestion importance[1-10]: 4

__

Why: This is a good practice suggestion for managing thread lifecycle, but since the thread is a daemon, the program can exit without explicitly joining it, making this a moderate-impact improvement.

Low
Switch to a reentrant lock

Replace threading.Lock with threading.RLock to allow for nested lock
acquisitions in the future.

server.py [8]

-status_lock = threading.Lock()
+status_lock = threading.RLock()
  • Apply / Chat
Suggestion importance[1-10]: 2

__

Why: The suggestion proposes a valid alternative lock type, but RLock is not necessary for the current implementation and adds complexity for a hypothetical future use case.

Low
  • More

@Sypher845
Copy link
Copy Markdown
Author

@tsebastiani I have added the screenshot of the test_server passing successfully in the description( also 5 tests were added to verify thread safety improvements).
For the effectiveness, it is not possible to provide a screenshot , but using threading.Lock() will prevent race conditions when multiple thread read/write server_status concurrently.

@Sypher845
Copy link
Copy Markdown
Author

@tsebastiani any update on this

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add thread safety to server.py status management

3 participants