CS Concepts

These notes explain terminology from CS-related subjects, aimed at short-answer questions on graduate entrance exams (修考) and common technical interview questions. All of the terms come from CS-related disciplines. The notes are organized by subject, and each subject lists its frequently asked terms with explanations (Chinese/English).

1. Data Structures & Algorithms

Hash Table

A Hash Table is a data structure that maps keys to values using a hash function. It stores data in an array-like structure, where each key is hashed to an index. Collisions (when different keys map to the same index) are handled using techniques like chaining or open addressing. Hash tables provide average O(1) time complexity for insertion, deletion, and lookup. They are widely used in databases, caching, and symbol tables.
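
As a minimal sketch, Python's built-in dict is itself a hash table, so the basic insert/lookup/delete behavior described above can be shown with it directly (the variable names are illustrative):

```python
# Python's dict is a hash table: keys are hashed to bucket indices internally.
phone_book = {}                       # create an empty hash table
phone_book["alice"] = "555-0101"      # insert: average O(1)
phone_book["bob"] = "555-0102"

print(phone_book["alice"])            # lookup: average O(1) -> 555-0101
del phone_book["bob"]                 # delete: average O(1)
print("bob" in phone_book)            # membership check -> False
```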


Hash Collision / Hash Clash

A hash collision (hash clash) occurs when two different inputs produce the same hash value in a hash function. Since hash functions map a large input space to a smaller output space, collisions are inevitable due to the pigeonhole principle. Collisions can weaken security in cryptographic hashes (e.g., MD5, SHA-1) and reduce efficiency in hash tables. Techniques like chaining, open addressing, and better hash functions help mitigate collisions. Stronger cryptographic hashes (e.g., SHA-256) minimize the risk of intentional hash clashes (collision attacks).
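
A toy illustration: the deliberately weak character-sum hash below is invented for demonstration (it is not a real hash function), but it shows how two different keys can land on the same bucket index:

```python
# A weak hash: sum of character codes modulo the table size.
def weak_hash(key: str, table_size: int = 8) -> int:
    return sum(ord(c) for c in key) % table_size

# "listen" and "silent" are anagrams, so their character sums are identical:
print(weak_hash("listen"))   # same index...
print(weak_hash("silent"))   # ...as this one -> a hash collision
```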


Open Addressing / Closed Hashing

Open addressing is a collision resolution technique in hash tables where all elements are stored directly in the table without external chaining. When a collision occurs, the algorithm searches for the next available slot using a probing sequence (e.g., linear probing, quadratic probing, or double hashing). Linear probing checks the next slot sequentially, quadratic probing uses a quadratic function to find slots, and double hashing applies a second hash function for probing. Open addressing avoids linked lists, reducing memory overhead but may suffer from clustering. It works best when the load factor is kept low to maintain efficient lookups.
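
A minimal linear-probing sketch in Python (class and method names are illustrative; it omits deletion and resizing, and assumes the table never fills up):

```python
class LinearProbingTable:
    def __init__(self, size=8):
        self.size = size
        self.slots = [None] * size          # each slot holds (key, value) or None

    def _probe(self, key):
        i = hash(key) % self.size
        # Linear probing: on a collision, try the next slot until we find
        # an empty slot or the matching key (assumes the table is not full).
        while self.slots[i] is not None and self.slots[i][0] != key:
            i = (i + 1) % self.size
        return i

    def put(self, key, value):
        self.slots[self._probe(key)] = (key, value)

    def get(self, key):
        slot = self.slots[self._probe(key)]
        return slot[1] if slot else None

t = LinearProbingTable()
t.put("a", 1); t.put("b", 2)
print(t.get("a"), t.get("b"))   # 1 2
```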


Separate Chaining

Separate chaining is a collision resolution technique in hash tables where each bucket stores multiple values using a linked list (or another data structure like a BST). When a collision occurs, the new element is simply added to the linked list at that index. This method allows the table to handle an unlimited number of collisions but increases memory usage. Performance depends on the length of the chains: with a well-distributed hash function, the average lookup time remains O(1), but it degrades to O(n) in the worst case when many keys hash to the same bucket. Rehashing or using a larger table can help maintain efficiency.
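
A minimal separate-chaining sketch in Python (names are illustrative; Python lists stand in for the linked-list chains, and there is no rehashing):

```python
class ChainedHashTable:
    def __init__(self, size=8):
        self.buckets = [[] for _ in range(size)]   # one chain per bucket

    def put(self, key, value):
        bucket = self.buckets[hash(key) % len(self.buckets)]
        for i, (k, _) in enumerate(bucket):
            if k == key:                  # key already present: update in place
                bucket[i] = (key, value)
                return
        bucket.append((key, value))       # new key or collision: append to chain

    def get(self, key):
        bucket = self.buckets[hash(key) % len(self.buckets)]
        for k, v in bucket:
            if k == key:
                return v
        return None

t = ChainedHashTable()
t.put("x", 10); t.put("y", 20)
print(t.get("x"), t.get("y"))   # 10 20
```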


P ≠ NP

P ≠ NP means that not all problems whose solutions can be verified quickly (in polynomial time) can also be solved quickly.
P is the class of problems that can be solved in polynomial time.
NP is the class of problems whose solutions can be verified in polynomial time.
The question is: if a problem’s solution can be verified quickly, can it also be found quickly?
Most experts believe P ≠ NP, but it has not been proven yet.


NP-Hard Problems

NP-Hard problems are computational problems that are at least as difficult as the hardest problems in NP (nondeterministic polynomial time). Solving an NP-Hard problem quickly would mean we could solve every NP problem quickly too, but no such efficient solution is known. These problems may not even have verifiable solutions in polynomial time. They often involve optimization or decision-making with many possibilities, like scheduling, routing, or packing. Examples include the Traveling Salesman Problem and the Knapsack Problem.


NP-Complete Problems

NP-Complete problems are a special class of problems that are both:

  1. In NP – their solutions can be verified in polynomial time, and
  2. NP-Hard – as hard as the hardest problems in NP.

If you can solve any NP-Complete problem quickly (in polynomial time), you can solve all NP problems quickly. Famous examples include the Boolean Satisfiability Problem (SAT) and Traveling Salesman Problem (decision version).


The Travelling Salesman Problem

The Travelling Salesman Problem (TSP) asks for the shortest possible route that visits each city once and returns to the starting city.
It is a classic combinatorial optimization problem in computer science and operations research.
TSP is NP-hard, meaning there’s no known efficient algorithm to solve all cases quickly.
Exact solutions use methods like brute force, dynamic programming, or branch and bound.
Approximation and heuristic algorithms (e.g. genetic algorithms, simulated annealing) are used for large instances.
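
A brute-force sketch on a tiny invented distance matrix, mainly to show why exhaustive search does not scale: it enumerates all (n-1)! possible tours.

```python
from itertools import permutations

# Symmetric distance matrix for 4 cities (example data).
dist = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
]

def tsp_brute_force(dist):
    n = len(dist)
    best_len, best_tour = float("inf"), None
    for perm in permutations(range(1, n)):          # fix city 0 as the start
        tour = (0,) + perm + (0,)                   # visit every city, return to 0
        length = sum(dist[tour[i]][tour[i + 1]] for i in range(n))
        if length < best_len:
            best_len, best_tour = length, tour
    return best_len, best_tour

print(tsp_brute_force(dist))   # (18, (0, 1, 3, 2, 0))
```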


The Knapsack Problem

The Knapsack Problem asks how to choose items with given weights and values to maximize total value without exceeding a weight limit.
Each item can be either included or excluded (0-1 Knapsack).
It’s a classic NP-hard optimization problem.
Dynamic programming is commonly used for exact solutions.
Greedy or approximation methods are used for large instances.
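
A small dynamic-programming sketch for the 0-1 variant, using example weights and values chosen for illustration:

```python
# 0-1 knapsack via dynamic programming: dp[w] = best value achievable with capacity w.
def knapsack(weights, values, capacity):
    dp = [0] * (capacity + 1)
    for wt, val in zip(weights, values):
        # Iterate capacities downwards so each item is used at most once.
        for w in range(capacity, wt - 1, -1):
            dp[w] = max(dp[w], dp[w - wt] + val)
    return dp[capacity]

print(knapsack(weights=[2, 3, 4], values=[3, 4, 5], capacity=5))   # 7
```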


The SAT (Boolean Satisfiability) problem

The SAT (Boolean Satisfiability) problem asks whether there exists an assignment of true/false values to variables that makes a Boolean formula true.
It’s the first problem proven to be NP-complete.
The formula is usually given in CNF (Conjunctive Normal Form).
SAT solvers use techniques like backtracking and clause learning.
Many real-world problems (e.g. planning, verification) can be reduced to SAT.
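
A brute-force sketch that tries every truth assignment for a tiny hand-written CNF formula; real SAT solvers are far more sophisticated, so this only illustrates the problem statement:

```python
from itertools import product

# A clause is a list of literals: positive integers mean the variable,
# negative integers mean its negation.
# Example formula: (x1 OR NOT x2) AND (x2 OR x3) AND (NOT x1 OR NOT x3)
cnf = [[1, -2], [2, 3], [-1, -3]]
num_vars = 3

def is_satisfiable(cnf, num_vars):
    for bits in product([False, True], repeat=num_vars):
        assignment = {i + 1: bits[i] for i in range(num_vars)}
        # A clause is satisfied if any of its literals evaluates to true.
        if all(any(assignment[abs(l)] == (l > 0) for l in clause) for clause in cnf):
            return assignment          # a satisfying assignment exists
    return None                        # formula is unsatisfiable

print(is_satisfiable(cnf, num_vars))   # {1: False, 2: False, 3: True}
```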


Divide and Conquer

Divide and Conquer is a problem-solving strategy that works in three main steps:

  1. Divide the problem into smaller sub-problems of the same type.
  2. Conquer each sub-problem by solving them recursively.
  3. Combine the solutions of sub-problems to get the final result.

Classic examples include Merge Sort, Quick Sort, and Binary Search.
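
A small sketch using binary search (merge sort and quick sort appear in their own sections below); each recursive call divides the sorted search range in half and conquers only one side:

```python
def binary_search(arr, target, lo=0, hi=None):
    if hi is None:
        hi = len(arr) - 1
    if lo > hi:                      # empty sub-problem: not found
        return -1
    mid = (lo + hi) // 2             # divide
    if arr[mid] == target:
        return mid
    if arr[mid] < target:            # conquer the right half
        return binary_search(arr, target, mid + 1, hi)
    return binary_search(arr, target, lo, mid - 1)   # or the left half

print(binary_search([1, 3, 5, 7, 9, 11], 7))   # 3
```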


Branch and Bound

Branch and Bound is an algorithmic paradigm for solving combinatorial optimization problems efficiently. It systematically divides the solution space into smaller subproblems (branching) and computes bounds to eliminate unpromising branches early (pruning). The method maintains an upper bound from feasible solutions and a lower bound from relaxed problems, allowing it to discard branches that cannot contain the optimal solution. This approach significantly reduces the search space compared to exhaustive enumeration, making it effective for problems like integer programming, traveling salesman, and knapsack problems.
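
A compact sketch for the 0-1 knapsack problem, where the bound is the optimistic value of the fractional (relaxed) knapsack on the remaining items; the example data is invented for illustration:

```python
def knapsack_bb(weights, values, capacity):
    # Sort by value density so the fractional relaxation gives a tight bound.
    items = sorted(zip(weights, values), key=lambda wv: wv[1] / wv[0], reverse=True)
    best = 0

    def bound(i, cap, value):
        # Optimistic estimate: fill the remaining capacity fractionally.
        for w, v in items[i:]:
            if w <= cap:
                cap, value = cap - w, value + v
            else:
                return value + v * cap / w
        return value

    def branch(i, cap, value):
        nonlocal best
        best = max(best, value)
        if i == len(items) or bound(i, cap, value) <= best:
            return                                   # prune this branch
        w, v = items[i]
        if w <= cap:
            branch(i + 1, cap - w, value + v)        # include item i
        branch(i + 1, cap, value)                    # exclude item i

    branch(0, capacity, 0)
    return best

print(knapsack_bb([2, 3, 4], [3, 4, 5], 5))   # 7
```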


B-tree

A B-tree is a self-balancing search tree that maintains sorted data for efficient insertion, deletion, and search in O(log n) time.
It allows each node to have multiple keys and children, making it wider and shallower than binary trees.
All leaf nodes are at the same level, ensuring the tree stays balanced.
It’s widely used in databases and file systems for fast disk-based access.


Heap Sort

Heap Sort is a comparison-based sorting algorithm that uses a binary heap data structure. It works in two main steps:

  1. Build a Max Heap from the input data so that the largest element is at the root.
  2. Extract the root (maximum element), swap it with the last item, reduce the heap size, and heapify the root to maintain the max heap. Repeat until the heap is empty.

Time Complexity: Best, Average, Worst: O(n log n)

Key Characteristics:

  • In-place sorting (no extra space needed)
  • Not stable
  • Good for scenarios where memory is limited and performance is important.
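
A minimal in-place sketch of the two steps above: build a max heap, then repeatedly move the root (maximum) to the end and re-heapify the shrinking prefix.

```python
def heapify(arr, n, i):
    # Sift element i down so the subtree rooted at i is a max heap.
    largest = i
    left, right = 2 * i + 1, 2 * i + 2
    if left < n and arr[left] > arr[largest]:
        largest = left
    if right < n and arr[right] > arr[largest]:
        largest = right
    if largest != i:
        arr[i], arr[largest] = arr[largest], arr[i]
        heapify(arr, n, largest)

def heap_sort(arr):
    n = len(arr)
    for i in range(n // 2 - 1, -1, -1):      # step 1: build a max heap
        heapify(arr, n, i)
    for end in range(n - 1, 0, -1):          # step 2: extract the max repeatedly
        arr[0], arr[end] = arr[end], arr[0]  # move current max to its final place
        heapify(arr, end, 0)                 # restore the heap on the prefix
    return arr

print(heap_sort([5, 1, 4, 2, 8]))   # [1, 2, 4, 5, 8]
```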

Merge Sort

Merge Sort is a divide and conquer sorting algorithm that divides the input array into smaller parts, sorts them, and then merges the sorted parts.

Steps:

  1. Divide the array into two halves.
  2. Recursively sort each half.
  3. Merge the two sorted halves into one sorted array.

Time Complexity:

  • Best, Average, Worst: O(n log n)

Key Characteristics:

  • Stable sort
  • Not in-place (uses extra space for merging)
  • Great for sorting linked lists or large datasets with consistent performance.
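
A minimal sketch of the three steps above; this version returns a new list rather than sorting in place, matching the "not in-place" note:

```python
def merge_sort(arr):
    if len(arr) <= 1:
        return arr
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])            # divide and recursively sort
    right = merge_sort(arr[mid:])
    merged, i, j = [], 0, 0                 # merge the two sorted halves
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:             # <= keeps equal elements stable
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    return merged + left[i:] + right[j:]

print(merge_sort([5, 1, 4, 2, 8, 2]))   # [1, 2, 2, 4, 5, 8]
```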

Quick Sort

Quick Sort is a divide and conquer sorting algorithm that works by selecting a pivot element and partitioning the array into two parts:

Steps:

  1. Choose a pivot (e.g., first, last, or random element).
  2. Partition the array: elements less than pivot go left, greater go right.
  3. Recursively apply Quick Sort to the left and right parts.

Time Complexity:

  • Best & Average: O(n log n)
  • Worst (unbalanced partition): O(n²)

Key Characteristics:

  • In-place
  • Not stable
  • Very fast in practice with good pivot selection
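
A short, readable sketch using a randomly chosen pivot; note that this version builds new lists for clarity, whereas the classical in-place version partitions by swapping elements within the array:

```python
import random

def quick_sort(arr):
    if len(arr) <= 1:
        return arr
    pivot = random.choice(arr)                       # random pivot choice
    left = [x for x in arr if x < pivot]             # partition around the pivot
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    return quick_sort(left) + middle + quick_sort(right)

print(quick_sort([5, 1, 4, 2, 8, 2]))   # [1, 2, 2, 4, 5, 8]
```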

A* Algorithm

The A* algorithm finds the shortest path by combining actual cost from the start (g) and estimated cost to the goal (h).
It selects nodes with the lowest total cost f(n) = g(n) + h(n).
It uses a priority queue to explore the most promising paths first.
The heuristic h(n) must not overestimate to ensure optimality.
It’s widely used in games, maps, and AI for efficient pathfinding.
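
A minimal sketch on a small invented grid (0 = free cell, 1 = wall), using Manhattan distance as the heuristic h(n) and a priority queue ordered by f(n) = g(n) + h(n):

```python
import heapq

def a_star(grid, start, goal):
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])   # heuristic h(n)
    open_set = [(h(start), 0, start)]         # entries are (f, g, node)
    g_cost = {start: 0}
    while open_set:
        f, g, node = heapq.heappop(open_set)  # node with the smallest f = g + h
        if node == goal:
            return g                          # length of the shortest path
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            r, c = node[0] + dr, node[1] + dc
            if 0 <= r < len(grid) and 0 <= c < len(grid[0]) and grid[r][c] == 0:
                if g + 1 < g_cost.get((r, c), float("inf")):
                    g_cost[(r, c)] = g + 1
                    heapq.heappush(open_set, (g + 1 + h((r, c)), g + 1, (r, c)))
    return None                               # goal unreachable

grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
print(a_star(grid, (0, 0), (2, 0)))   # 6
```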


Minimax Algorithm

The Minimax algorithm is used in two-player games to find the optimal move by assuming both players play optimally.
It recursively explores all possible moves, maximizing the player’s score and minimizing the opponent’s.
“Max” tries to get the highest score; “Min” tries to get the lowest.
The game tree is evaluated using a scoring function at terminal states.
It’s often optimized with alpha-beta pruning to skip unnecessary branches.
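
A minimal sketch on a tiny hand-built game tree, where leaves are scores for the maximizing player; alpha-beta pruning is omitted for brevity:

```python
# Internal nodes are lists of children; leaves are integer scores.
def minimax(node, maximizing):
    if isinstance(node, int):                    # terminal state: return its score
        return node
    scores = [minimax(child, not maximizing) for child in node]
    return max(scores) if maximizing else min(scores)

# Max chooses between two moves; Min then replies to whichever was chosen.
game_tree = [[3, 5], [2, 9]]
print(minimax(game_tree, maximizing=True))   # 3
```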


2. Operating System

3. Computer Architecture

4. Formal Language & Automata

5. Programming Language

6. Machine Learning

7. Digital Circuit

8. Computer Networks

Distributed Hash Table (DHT)

A Distributed Hash Table (DHT) is a decentralized system that provides a lookup service similar to a hash table: it stores (key, value) pairs and allows efficient retrieval of the value given a key. Instead of relying on a central server, DHT distributes the data across a network of nodes (often used in peer-to-peer networks and IoT systems).

Key Features:

  • Scalable: Works well even with thousands of nodes.
  • Fault-tolerant: Data is replicated, so the system can handle node failures.
  • Efficient: Most DHTs (like Chord or Kademlia) can find data in O(log n) time.

DHTs are often used in applications like BitTorrent and distributed file systems, and are also relevant in wireless communication and IoT contexts.
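
A toy Chord-style sketch (the node names and 16-bit ring size are invented for illustration): node IDs and keys are hashed onto the same ring, and each key is assigned to its successor, the first node clockwise from the key's position.

```python
import hashlib
from bisect import bisect_left

def ring_id(name, bits=16):
    # Hash a node name or key onto a ring of 2**bits positions.
    return int(hashlib.sha1(name.encode()).hexdigest(), 16) % (2 ** bits)

nodes = sorted(ring_id(n) for n in ["node-A", "node-B", "node-C", "node-D"])

def responsible_node(key):
    kid = ring_id(key)
    i = bisect_left(nodes, kid)          # first node id >= key id...
    return nodes[i % len(nodes)]         # ...wrapping around the ring if needed

print(responsible_node("some-file.txt"))   # ring id of the node storing this key
```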


9. Cryptography

10. Digital Signal Processing

11. Control Engineering

12. Software Development

13. Robotics

