A bit of a different post today. I recently had a series of back-to-back(ish) technical interviews, each with a unique challenge and core competency.

I've been doing this long enough that I've seen technical appraisals ranging from take-home assignments to timed, closed-book algorithm design quizzes to deep-dive situational questions about times when I've encountered workplace conflict or led a refactor of a monolithic legacy system. Each has its own set of challenges, but what never differs is my commitment to preparation. In a troubling market, interviews seem to be few and far between, and I refuse to squander any opportunities I earn.

In this case, three different formats in rapid succession with clear themes but vague subject matter meant that I needed to take a bit of a buckshot approach with core engineering concepts so that I could be ready for anything. This approach culminated in a study guide of sorts, which I've decided to share.

This isn't a comprehensive list of anything in particular. Quite the contrary, in fact, as I generated the initial set of bullet points off the cuff as I dug into the remembered archives of my computer engineering curriculum. It is, however, a fairly decent breakdown of topics that are important to consider in system architecture, algorithm design, and full-stack monitoring and observability. Truth be told, it reminded me of studying for exams back in the day, so it occurred to me that there may be some out there who could use this sort of refresher.

Enjoy!

Competency 1: Proactive Problem Solving/Quality Code

Coding exercise.

Review

Data structures (complexity and use cases)

  • array: no inherent relationship between elements, linear, spatially efficient, contiguous in memory; O(1) access by index, but O(n) search for unordered elements
  • linked list: each element connects to the next (and vice versa in doubly linked lists), easier to allocate since memory is not contiguous, expensive to find a specific element
  • hash: O(1) average access; collisions are handled via chaining or probing, which degrades lookups toward O(n) in the worst case
  • queue: FIFO, good for ordered events
  • stack: LIFO, good for stateful entities
  • tree: hierarchical nodes with directional relationships; every path terminates on a leaf node
  • graph: non-hierarchical nodes with unstructured relationships; paths may loop or have bi-directional node relationships
  • heap: tree-based structure satisfying the heap property (each parent ordered relative to its children); usually array-backed, and the standard backing for priority queues
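To make the heap bullet concrete: a binary heap is usually stored flat in an array, with parent/child positions computed by index arithmetic (the same trick heapsort uses). A minimal min-heap sketch in TypeScript (the `MinHeap` class name is mine, not from any library):

```typescript
// Minimal binary min-heap stored in a flat array.
// parent(i) = floor((i - 1) / 2); children of i are 2i + 1 and 2i + 2.
class MinHeap {
  private data: number[] = [];

  push(value: number): void {
    this.data.push(value);
    let i = this.data.length - 1;
    // Sift up: swap with the parent while smaller than it.
    while (i > 0) {
      const parent = (i - 1) >> 1;
      if (this.data[i] >= this.data[parent]) break;
      [this.data[i], this.data[parent]] = [this.data[parent], this.data[i]];
      i = parent;
    }
  }

  pop(): number | undefined {
    if (this.data.length === 0) return undefined;
    const min = this.data[0];
    const last = this.data.pop()!;
    if (this.data.length > 0) {
      this.data[0] = last;
      // Sift down: swap with the smaller child while out of order.
      let i = 0;
      for (;;) {
        const left = 2 * i + 1, right = 2 * i + 2;
        let smallest = i;
        if (left < this.data.length && this.data[left] < this.data[smallest]) smallest = left;
        if (right < this.data.length && this.data[right] < this.data[smallest]) smallest = right;
        if (smallest === i) break;
        [this.data[i], this.data[smallest]] = [this.data[smallest], this.data[i]];
        i = smallest;
      }
    }
    return min;
  }

  get size(): number { return this.data.length; }
}
```

Both `push` and `pop` are O(log n) and peeking the minimum is O(1), which is exactly why heaps back priority queues.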

Algorithms (complexity and use cases): sorting, traversal, searching

Note: it's been mathematically proven that a comparison sort cannot perform better than O(n log n) in the average case

  • Merge sort - best/worst/average O(n log n) - memory O(n) - divide and conquer general-purpose algorithm which starts with single-element lists and then merges them to sort
  • Quicksort - best/average O(n log n) - worst O(n²) - memory O(log n) - one of the most-used sorting algorithms, which on average is faster than merge sort and heapsort
  • Insertion sort - best O(n), average/worst O(n²) - memory O(1) - slow but uncomplicated, works by shifting each element leftward into its position within the already-sorted prefix
  • Heapsort - best/worst/average O(n log n) - memory O(1) - optimized selection sort, which quickly determines the max element and then places it at the end of the list. Slower on average than an optimal quicksort, but easier to implement. Treats the set as a binary tree using arithmetic on the indices
  • Traversal
    - depth first
      — inorder: visit the left subtree, then the node itself, then the right subtree (yields sorted order for a binary search tree)
      — preorder: visit the node before traversing to its children
      — postorder: visit all of a node's children before the node itself
    - breadth first
      — level order: root has priority, and child nodes of the same depth are visited before any deeper children
      — create a queue, enqueue the root node, then loop while the queue is not empty by dequeueing the first element to visit it and then enqueueing its children
  • Searching
    - two pointers — useful if the input is sorted, if the problem uses a range or subarray, if the problem involves maintaining a window (which can change in size), or for linked lists
    - binary search — works on sorted arrays and binary search trees — O(log n) — memory O(1) when iterative
    - regex — efficient string searches
    - fibonacci search — divide and conquer on sorted arrays — O(log n) — memory O(1)
  • Shuffling
    - Fisher-Yates — step down through the array and swap each element with a randomly chosen element at or below its position — GOLD STANDARD
    - selection shuffle — form a new set by randomly selecting from remaining elements of the original set
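The level-order recipe above (queue the root, then dequeue, visit, and enqueue children in a loop) translates almost line for line into code. A sketch in TypeScript (the `TreeNode` shape is mine, for illustration):

```typescript
interface TreeNode {
  value: number;
  children: TreeNode[];
}

// Breadth-first, level-order traversal: visit every node at one
// depth before descending, using a FIFO queue as described above.
function levelOrder(root: TreeNode): number[] {
  const visited: number[] = [];
  const queue: TreeNode[] = [root];   // 1. enqueue the root
  while (queue.length > 0) {          // 2. loop while the queue is non-empty
    const node = queue.shift()!;      // 3. dequeue the first element
    visited.push(node.value);         // 4. visit it
    queue.push(...node.children);     // 5. enqueue its children
  }
  return visited;
}
```

One caveat: `Array.prototype.shift` is O(n), so a production version would use a real queue (or an index pointer) to keep the traversal O(n) overall.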

Writing unit tests without a framework (i.e. Jasmine with plain TypeScript, or TypeScript tests with no framework at all)

  • Use a framework. Barring that, you need to ensure that the functions you are testing are exposed to the harness and that context is appropriately provided
  • Unit (ideally headless) - Jasmine, Jest — Be cognizant of memory leaks
  • Integration (headless) - Jest, Mocha, Cypress, Playwright, Protractor
  • E2E: Playwright, TestCafe, Cypress, Protractor
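When no framework is available, "unit testing" reduces to exposing the function under test and asserting on its output. A bare-bones harness sketch (the `test`, `expectEqual`, and `runTests` names are mine, not from any library):

```typescript
// A minimal test harness: a registry of named test functions,
// with a thrown Error standing in for an assertion failure.
type TestFn = () => void;
const tests: Array<[string, TestFn]> = [];

function test(name: string, fn: TestFn): void {
  tests.push([name, fn]);
}

function expectEqual<T>(actual: T, expected: T): void {
  if (actual !== expected) {
    throw new Error(`expected ${expected}, got ${actual}`);
  }
}

function runTests(): { passed: number; failed: number } {
  let passed = 0, failed = 0;
  for (const [name, fn] of tests) {
    try { fn(); passed++; console.log(`PASS ${name}`); }
    catch (e) { failed++; console.log(`FAIL ${name}: ${(e as Error).message}`); }
  }
  return { passed, failed };
}

// The function under test must be in scope (exported/imported as needed).
const add = (a: number, b: number): number => a + b;

test("add sums two numbers", () => expectEqual(add(2, 3), 5));
```

This is essentially what Jasmine's `it`/`expect` do under the hood, minus spies, matchers, and lifecycle hooks.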

Functional programming

Principles

  1. Functions are first-class citizens (can be assigned to variables, can be passed as arguments, and can be returned)
  2. Functions are deterministic (same input equals same output)
  3. Functions do not create side effects (they cannot change data outside themselves)
  4. Data is immutable (always return a copy, never modify in-place or by reference)
  5. Declarative (describe what you want to achieve rather than how to achieve it)
  6. Use function composition (chaining)
  7. Prefer recursion over loops
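The principles above fit in a few lines: pure, deterministic functions over immutable data, composed declaratively. The `compose` helper here is a common idiom I've written out, not a built-in:

```typescript
// Pure functions: same input, same output, no side effects.
const double = (n: number): number => n * 2;
const increment = (n: number): number => n + 1;

// Function composition: right-to-left chaining of unary functions.
const compose = <T>(...fns: Array<(x: T) => T>) =>
  (x: T): T => fns.reduceRight((acc, fn) => fn(acc), x);

// Reads declaratively: "increment after double".
const doubleThenIncrement = compose(increment, double);

// Immutability: map returns a new array; the original is untouched.
const input = [1, 2, 3];
const output = input.map(doubleThenIncrement); // [3, 5, 7]
```

Because `double` and `increment` are deterministic and side-effect free, `doubleThenIncrement` is trivially unit-testable in isolation.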

Benefits

  • easier to understand at a glance
  • highly testable
  • no side-effects means easier triage and consistent behaviours
  • immutable data is better for reactive interfaces because the state will reload when the reference changes

Drawbacks

  • everything is atomic, so it can be difficult to track a whole interaction
  • typically less computationally efficient than imperative programming
  • abstractions may not be optimized in isolation

Complexity analysis

  • when determining complexity, the largest term takes precedence since it scales the fastest; constant factors and lower-order terms are dropped
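For example, a function with an O(n²) nested scan plus an O(n) pass is still O(n²) overall, because the quadratic term dominates as n grows (a hypothetical illustration, names mine):

```typescript
// Counts pairs in xs that sum to `target`. The nested loop is O(n²);
// tacking on an O(n) preprocessing pass would not change the overall
// class, since O(n² + n) = O(n²).
function pairSumCount(xs: number[], target: number): number {
  let count = 0;
  for (let i = 0; i < xs.length; i++) {
    for (let j = i + 1; j < xs.length; j++) {
      if (xs[i] + xs[j] === target) count++;
    }
  }
  return count;
}
```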

Test-Driven Development (TDD)

  • Write tests first, then add functionality which allows those tests to pass
  • Declarative approach: describe what you want to accomplish and then build the system that accomplishes it

Process

  1. List scenarios for the new feature
  2. Write a test for an item on the list
  3. Run all tests; new test should fail for expected reasons
  4. Write the simplest code that passes the test
  5. All tests should now pass
  6. Refactor while ensuring that all tests continue to pass
  7. Repeat for each item on the list
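The cycle above, sketched with plain assertions and no framework (`slugify` is an invented example feature): the test exists first and fails, then the simplest implementation makes it pass, and refactors must keep it passing.

```typescript
// Step 2: write the test before the implementation. Running it
// against a missing or stubbed slugify fails for the expected reason.
function testSlugify(slugify: (s: string) => string): void {
  if (slugify("Hello World") !== "hello-world") {
    throw new Error("slugify should lowercase and hyphenate");
  }
}

// Step 4: the simplest code that passes the test.
const slugify = (s: string): string =>
  s.toLowerCase().split(" ").join("-");

// Steps 5-6: the test passes; any refactor reruns it to stay green.
testSlugify(slugify);
```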

Competency 2: Software Architecture and System Design

Design full system to solve a practical problem (theory).

Review

Software Patterns (use cases and high-level implementation): MVC, Gang of Four

  • Model-View-Controller (MVC) - Model: data - View: interface - Controller: logic - Examples include server-side rendering, API-driven interfaces
  • Model-View-ViewModel (MVVM) - Model: data - View: interface - ViewModel: also known as the data transfer object; abstraction of the view which uses a binder to automate synchronization of bound, public properties in the view - This is similar to the MVP pattern, but the main difference is that the Presenter holds a reference to the View while the ViewModel does not
  • Gang of Four
    - Factory — Creates objects without having to specify the exact class — Useful for creating dynamic objects using an interface (i.e. not using constructors, which is good for generics)
    - Builder — Constructs objects in stages by decoupling construction and representation — Chains factory methods
    - Prototype — Creates instances by cloning an existing object
    - Singleton — Instantiate once, return that instance on subsequent access — Good for entities which are expensive to instantiate or utilities for which individual state is not necessary
    - Chain of Responsibility — Each processing object maintains a list of the types of command objects it can handle — Any unhandled objects are passed to the next processing object in the chain
    - Command — Encapsulates a request as a standalone object carrying everything needed to perform the action, decoupling the invoker from the receiver — Enables queuing, logging, and undo of operations
    - Interpreter — Translates symbolic representations of language by composing a syntax tree
    - Iterator — Accesses objects sequentially
    - Mediator — Allows loose coupling between classes by being the only entity with detailed knowledge of their functions
    - Memento — Allows for an object to be restored to a previous state snapshot (i.e. undo)
    - Observer — Subject maintains a list of dependents (observers, event sinks) which it notifies of any state changes; connection is a standard interface and subscriptions are managed internally — Tightly coupled — Subject calls observer methods directly; typically synchronously — Because the subscriptions are internally managed, can get slow for a lot of observers — Any event filtering is done internally — Exceptions in the observers can cause upstream crashes — GUI frameworks, MVC, local object notifications
    - Template Method (i.e. abstraction) — Defines a skeleton of an algorithm as an abstract class
    - Visitor — Decouples an algorithm from a data structure by moving the hierarchy of methods into one object
    - Publish-Subscribe — Loosely coupled — Subscribers manage their own subscriptions — Scales far better with many subscribers — Failures are isolated; the broker decouples participants
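The Observer bullet above in code: the subject keeps its own subscriber list (the tight coupling) and notifies each observer directly and synchronously on every state change. A minimal sketch, not any particular framework's API:

```typescript
type Observer<T> = (state: T) => void;

// Subject maintains its own list of observers and manages
// subscriptions internally, as described above.
class Subject<T> {
  private observers: Observer<T>[] = [];

  subscribe(observer: Observer<T>): () => void {
    this.observers.push(observer);
    // Return an unsubscribe handle; the subject owns the list.
    return () => {
      this.observers = this.observers.filter(o => o !== observer);
    };
  }

  notify(state: T): void {
    // Synchronous, direct calls: an exception thrown by any
    // observer propagates back into the subject (upstream crash).
    for (const observer of this.observers) observer(state);
  }
}
```

Contrast with publish-subscribe, where a broker sits between the two sides, subscribers manage their own subscriptions, and a failing subscriber cannot crash the publisher.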

Microservices

Benefits

  • highly modular
  • separation of concerns
  • decentralized
  • agile/flexible
  • scalable
  • considered to be a viable mechanism for modernizing legacy monolithic systems by systematically replacing modules

Drawbacks

  • higher latency due to more network requests
  • simplified testing of individual services, but more complicated integration and e2e tests
  • can overcomplicate system implementations by enforcing restrictions on service size
  • increased overhead for monitoring
  • cross-cutting requirement changes that span multiple services are costly to implement
  • it is necessary to extract datasets from each service and compose them into a larger schema in order to get a true view of the network

Caching

  • saving a response so that it can be quickly returned if another request is made with the same arguments
  • if data changes, the associated cache needs to be invalidated
  • this can be done using cache keys in filenames to indicate that the resource is different
  • cache can be manually evicted either by user decision or by using a server-side signal architecture
  • In-memory caching - Uses RAM with faster transfer speeds to store the cache - volatile and dependent on local device health
  • Distributed caching - data is saved across multiple servers/networks to allow for high availability - if cache is missed, check the persistent data store
  • Client-side caching - web-based strategy - stores data in the browser to avoid repeat network requests - can lead to stale data because it's difficult to know if the data has changed without contacting the server
  • Cache-aside/lazy loading - cache only contains data that has been actively requested
  • Write-through - writes update the cache and the backing store together, keeping the cache consistent at the cost of write latency
  • Pre-fetch - proactively loading resources that will likely be needed - asynchronous
  • Redis - in-memory data structure store - also used for session management and real-time analytics - can handle complex data structures and data persistence
  • Memcached - high-performance, lightweight key/value store - optimized for speed and memory efficiency - good for database calls
  • Amazon ElastiCache - managed service, cloud-based
  • Content Delivery Networks (CDN) - global distribution of static assets - e.g. Amazon CloudFront, Cloudflare
  • Nginx - server-side caching to prevent recalculation of HTML
  • CPU cache - L1/2/3 - super fast, low storage - used to store frequently-used instructions at the hardware level
  • Disk Cache - Frequently-accessed data from the hard disk may be loaded in RAM for faster access
  • Load balancing - distributing requests across parallel servers to ensure no one server gets overloaded - allows for individual servers/machines to be cycled without disruption of services - can introduce inconsistency during rolling updates unless servers are upgraded in lockstep

Fault tolerance schemes

  • graceful degradation - creating a fallback in case a specific feature or service is unsupported
  • redundancy - allocating more resources than needed; parallel copies of servers or data that can be accessed in case of failure
  • RAID 0 (striped) - evenly splitting data across multiple drives - no redundancy, so a single drive failure typically loses the whole array, since every file is striped across all drives - improves throughput, not fault tolerance
  • RAID 1 (mirrored) - exact copy of the data on multiple devices - n times the storage required, but device failure doesn't lead to data loss
  • RAID 2 - uses parity bits to split data at the bit level across multiple devices - allows data restoration using Hamming codes and parity bits - with modern devices, the benefits of this method are limited compared to other redundancy mechanisms
  • RAID 3 - byte-level striping and a dedicated parity disk - rarely used in practice - great for large-scale I/O with a fast transfer rate, but gets bad performance for small reads and writes - requires synchronized spindles, so the hardware overhead diminishes its usefulness - replaced by RAID 5
  • RAID 4 - block-level striping and a dedicated parity disk - good performance of random reads, but poor performance of random writes - replaced by RAID 5
  • RAID 5 - block-level striping with distributed parity - can restore data using parity in the case of a single drive failure, but if multiple drives fail then the data is lost - requires at least 3 disks
  • RAID 6 - as RAID 5, but with an extra parity block for added resilience - requires more processing power for both reads and writes - tolerates two simultaneous drive failures instead of one
  • horizontal scaling - in distributed systems, additional resources can be allocated if the allotted resources are running low or if high traffic is anticipated
  • vertical scaling - adds more compute power to individual servers instead of adding additional servers
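The parity schemes in RAID 4/5 boil down to XOR: the parity block is the XOR of the data blocks, so any single missing block can be rebuilt by XOR-ing the survivors. A toy byte-level illustration (not a real controller; the block values are arbitrary):

```typescript
// XOR parity over equal-length blocks: parity[i] = b1[i] ^ b2[i] ^ ...
function xorBlocks(blocks: number[][]): number[] {
  return blocks[0].map((_, i) =>
    blocks.reduce((acc, block) => acc ^ block[i], 0)
  );
}

const blockA = [0x12, 0x34, 0x56];
const blockB = [0xab, 0xcd, 0xef];

// The controller stores parity alongside the data blocks.
const parity = xorBlocks([blockA, blockB]);

// If blockA's drive dies, XOR-ing parity with the survivors
// reconstructs it, because a ^ b ^ b = a.
const recoveredA = xorBlocks([parity, blockB]);
```

This is also why RAID 5 cannot survive two failures: with two blocks missing, the single XOR equation no longer has a unique solution, which is what RAID 6's second, independent parity block addresses.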

Storage (use cases, speed characteristics, and optimization methods): database technologies, document-based storage, memory, registers, physical media, hardware caches, variable sizes

Database technologies

→ Atomicity, Consistency, Isolation, Durability (ACID): the standard transactional guarantees of relational databases

MariaDB

  • open source
  • broke off from MySQL

Advantages

  • up to twice as fast as MySQL
  • secure
  • free
  • innovative featureset

Disadvantages

  • limited vendor support
  • fractured releases

MongoDB

  • open source
  • unstructured data

Advantages

  • supports NoSQL
  • highly flexible
  • excellent read speed

Disadvantages

  • difficult setup
  • doesn't support SQL
  • lack of security (by default)
  • poor analytics

MySQL

  • dual-licensed: the community edition is open source, with paid commercial editions
  • managed by Oracle

Advantages

  • free (though paid versions exist for enterprise)
  • highly compatible

Disadvantages

  • sparse featureset
  • limited support

Oracle

  • ideal for enterprise

Advantages

  • robust
  • first-party support

Disadvantages

  • expensive; both in money and in required processing power

PostgreSQL

  • open source

Advantages

  • unlimited scaling
  • free
  • works with relational and non-relational datasets
  • ACID-compliant
  • native JSON support

Disadvantages

  • lack of documentation
  • slow read-only operations

SQLite

  • not meant for client/server interactions
  • ideal for embedded systems, IOT

Advantages

  • incredibly lightweight
  • fast on-disk storage model
  • can be used as a caching mechanism
  • built-in data analysis tooling

Disadvantages

  • not ideal for high traffic
  • limited concurrency
  • database size limited to 281 TB

SQL Server

  • suitable for mainframes and clouds

Advantages

  • visualization and metrics
  • fast
  • integration with Microsoft family of services

Disadvantages

  • highly expensive

NoSQL

  • e.g. document stores like Elasticsearch

Advantages

  • highly scalable
  • can store data in various formats in the same store
  • easy-to-update schemas
  • JSON queries
  • especially fast read operations

Disadvantages

  • doesn't support SQL
  • not ACID-compliant
  • lack of standardization
  • no support for JOINs
  • updates require complete replacement of the record
  • data integrity requires manual effort since consistency isn't enforced
  • embedded data can lead to redundancy

Distributed systems: communication protocols (TCP, UDP, ICMP, IPv4/6, ethernet)

  • Transmission Control Protocol (TCP) - retransmits lost packets to prevent data loss - must establish a connection on both ends via handshake - suitable for high-reliability systems where speed is less important - data is guaranteed to be ordered
  • User Datagram Protocol (UDP) - high speed, low latency - connectionless - lightweight (8 byte) header - no reliability or ordering guarantee - provides basic integrity checks via checksums - useful when speed is more important than accuracy; expect lost packets - vulnerable to DDoS
  • Internet Control Message Protocol (ICMP) - primarily used for diagnostics (e.g. ping and traceroute) - does not transmit user data - vulnerable to DDoS
  • File Transfer Protocol (FTP) - used to transfer large files between client and server - outdated, but still commonly used for massive files that other formats don't support
  • Hypertext Transfer Protocol (HTTP/HTTPS) - used for web
  • Simple Mail Transfer Protocol (SMTP) - primarily used for email

Cybersecurity: Network layers, firewalls, PII, encryption (RSA, AES, ChaCha20, ECC), secure hashing algorithms (SHA-256, SHA-512, SHA-3, MD5), MFA, OTP, HTTP vs HTTPS, NIST cybersecurity framework

  • ** REVIEW INCOMPLETE **

Competency 3: Operational and Engineering Excellence

Questions about proactively using metrics to measure and optimize performance.

Review

CI/CD: pipeline steps and how to quantify performance at each, GitHub actions, Docker, Kubernetes, GitLab, Jenkins, Terraform

Docker

  • containerization technology
  • allows for isolated and prepackaged environments with preset rules
  • can stand up alone or can be networked

Use cases

  • local development environments
  • testing environments
  • deployment packages

Kubernetes (K8s)

  • container management platform which orchestrates container runtime systems over clusters of networked resources
  • supports Docker and other technologies
  • open source
  • automation via kubectl
  • manages infrastructure abstraction
  • monitors and allows discovery of managed services
  • often the foundation for a Platform as a Service (PaaS), though not a complete PaaS on its own

GitHub/GitLab pipelines

  • can set up pre- and post-deployment steps
  • can be used for running automated tests, controlling merges, data validation, and automated deployment

Jenkins

  • extensible automation server

Advantages

  • can manage CI/CD via distributed networks
  • massive community support and lots of available plugins - Note: because the plugins are maintained by the community, a lot of them are redundant or no longer maintained
  • free, open source

Disadvantages

  • complex setup, and requires consistent maintenance
  • dated UI/UX
  • its popularity makes it a target for cyber attacks, so it requires frequent security updates - Note: using another service just because it's attacked less commonly is a bit of a cop out; it's like security through obscurity

Terraform

  • infrastructure coding tool
  • handles everything from compute instances and networking to DNS management and SaaS features
  • works alongside other common infrastructure tools (e.g. Docker, AWS, Google Cloud, Oracle Cloud, Azure, HCP)
  • Infrastructure as Code (IaC)

Azure

  • Microsoft's cloud computing platform
  • offers VMs, DBs, storage management, networking, and AI/ML

Advantages

  • operates globally
  • cost-effective scaling
  • first-party support
  • strong security and compliance

Disadvantages

  • cost management is complex due to the scale of the platform
  • steep learning curve
  • risk of getting locked into the Microsoft ecosystem
  • high floor for maintenance overhead

Product metrics

  • Customer insights: analyzing pain points, A/B testing, adoption/conversion rates, app ratings and reviews, UX analysis, accessibility
  • Device metrics and user agents

Troubleshooting and triage strategies

  • console statements - still incredibly useful for determining when or if code executes - shows order of events in asynchronous processes
  • stack trace - errors show the path of execution, which can help determine where the error originated - can easily get muddled in complex systems, as the error can be thrown by a proxy or handler due to an issue that originated before the trace
  • step-through debugging - breakpoints can give a state snapshot - useful to see the order in which functions are called - struggles with asynchronous processes because you can step into the lifecycle loop - prior analysis is typically required to determine where to look
  • styling hacks - changing CSS styles to something ridiculous for certain containers can let you debug UI elements (i.e. outlandish border or background colour)
  • browser dev tools - allow you to inspect elements, get state snapshots, run console commands, and see network traffic - doesn't provide any insights about how the application reached its current state

Monitoring Tools

  • Sentry, Raygun, Rollbar, LogRocket
  • ** REVIEW INCOMPLETE **

Performance Auditing Tools

  • Lighthouse, WebPageTest, DebugBear, SpeedCurve

Troubleshooting network requests with Postman

  • allows you to test real or mocked APIs with fixed data and predictable conditions
  • can easily switch environments to do real tasks in production when the UI is broken

Observability vs Monitoring

Monitoring

  • collects data to confirm that the system is operating as expected
  • generates reports and logs, may generate alerts when a fault is encountered
  • prioritizes UX metrics
  • "when" and "what"
  • proactive error-catching

Observability

  • allows for deeper investigation into anomalies
  • focuses on measuring metrics to see if they have an effect on the system
  • useful for determining causality
  • "why" and "how"
  • retrospective error analysis

Monitoring software

Application Performance Management (APM)

AppDynamics

Advantages

  • friendly, customizable UI
  • good resources for server monitoring
  • enables process and transaction tracing
  • uses ML to set baselines for alarms

Disadvantages

  • some tools are built in Flash, which is no longer supported
  • no published pricing

Amazon CloudWatch

Advantages

  • has a free tier for basic users, and after that pricing is reasonable
  • flexible and customizable

Disadvantages

  • can only be used with the AWS ecosystem
  • complex UI
  • no transaction tracing
  • requires custom setup for memory management metrics

DataDog

Advantages

  • powerful and configurable UI
  • easy onboarding

Disadvantages

  • more for monitoring than observability: good at detection, bad at tracing
  • complex log analytics
  • charges for log retention, up to 30 days without a custom pricing plan
  • this becomes more problematic with scale

DynaTrace

Advantages

  • easy to install and configure
  • offers a free version for up to 5 servers with no limits on data retention (though capped at 100k daily visits)
  • great at transaction and process tracing
  • detects changes to topology in real time

Disadvantages

  • complex UI with a steep learning curve
  • after the free tier, pricing is comparable to New Relic

Grafana

Advantages

  • free, open source
  • on-prem hosting available for enterprise plans
  • versatile data visualization
  • provides unified view for multiple data sources
  • customizable dashboards

Disadvantages

  • no customer support on free tier
  • steep learning curve
  • integration with a data source which isn't natively supported is difficult
  • resource intensive
  • basic alerts compared to other platforms
  • managing permissions can be difficult
  • not optimized for static reports

Kibana

  • part of the Elastic Stack (ELK: Elasticsearch, Logstash, Kibana, plus Beats)
  • open source

Advantages

  • tight Elasticsearch integration
  • robust visualization, alerting, and security features
  • optimized for log filtering using custom Kibana QL
  • no added cost for scaling

Disadvantages

  • must have exactly the same version as ES nodes, so upgrades require careful planning
  • not suitable for other data sources
  • steep learning curve
  • performance issues with large, unoptimized datasets

NOTE: Grafana is generally preferred over Kibana, but Kibana has far superior tools for deep dives into complex logs

New Relic

Advantages

  • friendly UI
  • good at data correlation
  • enables transaction and process tracing
  • good community support due to its widespread popularity

Disadvantages

  • SaaS-only (no on-premise solution)
  • only stores data for 8 days (averages are provided after this)
  • can be expensive
  • Lite version gives limited data and only retains it for 2 hours

Splunk

Advantages

  • excellent at ingesting large datasets in real time
  • strong security and compliance
  • powerful visualization tools
  • flexible and scalable
  • rich community support and good documentation

Disadvantages

  • high cost
  • steep learning curve
  • can struggle with performance at scale if data is unoptimized
  • complex maintenance
  • outdated UI

Linting: ESLint, TSLint, Biome, Prettier

  • ** REVIEW INCOMPLETE **

Theming

  • ** REVIEW INCOMPLETE **