Parallelizing Programs on the Blockchain
EVM-based blockchains are often too difficult to scale past 1,000 transactions per second. Transactions can reference shared states and dynamically call into other contracts. This means that transactions must be executed in serial. On the other hand, program writers don't need to declaratively enumerate their dependencies (e.g., state or other programs).
The first way to gain parallelization is to do away with dynamic function calling. Both Solana's Sealevel runtime and Move (Aptos/Sui) use static dispatch rather than dynamic. Sealevel enforces this at the "operating system" level, while Move does it as a domain-specific language (DSL).
Optimistic concurrency control executes transactions in parallel and then verifies that there are no conflicts by recording memory accesses. Aptos uses a version of this they call Block-STM (paper).
Collaborative scheduling can add another layer of optimization by ordering transactions in a way to increase throughput. However, you can also schedule transactions in a way to maximize extractable value.
Normal horizontal scaling doesn't work because, in permissionless systems, nodes often have to store the entire state of the system. Sharding (for transaction execution, not data availability) hasn't been done yet.