mirror of
https://github.com/sogou/workflow.git
synced 2026-02-08 01:33:17 +08:00

[Chinese version](README_cn.md)

## Sogou C++ Workflow

[License](https://github.com/sogou/workflow/blob/master/LICENSE)
[Language](https://en.cppreference.com/)
[Platform](https://img.shields.io/badge/platform-linux%20%7C%20macos%20%7C%20windows-lightgrey.svg)
[Build Status](https://travis-ci.org/sogou/workflow)

As **Sogou's C++ server engine**, Sogou C++ Workflow supports almost all **back-end C++ online services** of Sogou, including all search services, the cloud input method and online advertisements, handling more than **10 billion** requests every day. It is an **enterprise-level programming engine** with a light and elegant design that can satisfy most C++ back-end development requirements.

#### You can use it:

* To quickly build an **HTTP server**:

~~~cpp
#include <stdio.h>
#include "workflow/WFHttpServer.h"

int main()
{
    WFHttpServer server([](WFHttpTask *task) {
        task->get_resp()->append_output_body("<html>Hello World!</html>");
    });

    if (server.start(8888) == 0) { // start server on port 8888
        getchar(); // press "Enter" to end.
        server.stop();
    }

    return 0;
}
~~~

* As a **multifunctional asynchronous client**; it currently supports the `HTTP`, `Redis`, `MySQL` and `Kafka` protocols.
* To implement a **client/server on a user-defined protocol** and build your own **RPC system**.
    * [srpc](https://github.com/sogou/srpc), an independent open source project built on it, supports the srpc, brpc, trpc and thrift protocols.
* To build an **asynchronous workflow**; it supports common **series** and **parallel** structures as well as arbitrary **DAG** structures.
* As a **parallel computing tool**. In addition to **networking tasks**, Sogou C++ Workflow also includes **the scheduling of computing tasks**. All types of tasks can be put into **the same** flow.
* As an **asynchronous file IO tool** on `Linux`, with performance exceeding that of any system call. Disk file IO is also a task.
* To realize any **high-performance** and **high-concurrency** back-end service with a very complex relationship between computing and networking.
* To build a **microservice** system.
    * This project has built-in **service governance** and **load balancing** features.
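
To make the **DAG** structures mentioned in the list above more concrete, here is a minimal single-threaded sketch that uses only the C++ standard library. `Node` and `run_dag` are hypothetical names invented for this illustration and are not part of the Workflow API; the sketch only shows the ordering rule that a node runs after all of its predecessors have finished.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <string>
#include <vector>

// Hypothetical types for illustration only; this is not the Workflow API.
struct Node {
    std::string name;
    std::function<void()> work;      // what the node does when it runs
    std::vector<std::size_t> next;   // indices of the nodes that depend on this one
};

// Runs every node after all of its predecessors (Kahn's algorithm) and
// returns the node names in the order they ran. Nodes that become ready
// together are independent of each other, so an asynchronous engine could
// run them in parallel; this sketch simply runs them one by one.
std::vector<std::string> run_dag(const std::vector<Node>& nodes)
{
    // Count the unmet dependencies of every node.
    std::vector<std::size_t> waiting(nodes.size(), 0);
    for (const Node& n : nodes) {
        for (std::size_t succ : n.next) {
            assert(succ < nodes.size());
            ++waiting[succ];
        }
    }

    // Nodes without dependencies can start immediately.
    std::queue<std::size_t> ready;
    for (std::size_t i = 0; i < nodes.size(); ++i) {
        if (waiting[i] == 0)
            ready.push(i);
    }

    std::vector<std::string> order;
    while (!ready.empty()) {
        const std::size_t i = ready.front();
        ready.pop();
        if (nodes[i].work)
            nodes[i].work();
        order.push_back(nodes[i].name);

        // Finishing a node may unblock its successors.
        for (std::size_t succ : nodes[i].next) {
            if (--waiting[succ] == 0)
                ready.push(succ);
        }
    }
    return order;   // shorter than nodes.size() if the graph contains a cycle
}
```

A diamond-shaped graph, in which one node fans out to two independent nodes that both feed a final node, is the simplest case of a parallel structure placed inside a series.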

#### Compiling and running environment

* This project supports `Linux`, `macOS`, `Windows`, `Android` and other operating systems.
    * The `Windows` version is currently released as an independent [branch](https://github.com/sogou/workflow/tree/windows), using `iocp` to implement asynchronous networking. All user interfaces are consistent with the `Linux` version.
* Supports all CPU platforms, including 32-bit or 64-bit `x86` processors and big-endian or little-endian `arm` processors.
* Relies on `OpenSSL`; `OpenSSL 1.1` or above is recommended. If you do not want SSL, you can check out the [nossl](https://github.com/sogou/workflow/tree/nossl) branch, but you still need to link `crypto` for `md5` and `sha1`.
* Uses the `C++11` standard, so it must be compiled with a compiler that supports `C++11`. Does not rely on `boost` or `asio`.
* No other dependencies. However, if you need the `Kafka` protocol, some compression libraries should be installed, including `lz4`, `zstd` and `snappy`.

### Get started (Linux, macOS):

~~~sh
$ git clone https://github.com/sogou/workflow
$ cd workflow
$ make
$ cd tutorial
$ make
~~~

# Tutorials

* Client
    * [Creating your first task: wget](docs/en/tutorial-01-wget.md)
    * [Implementing Redis set and get: redis\_cli](docs/en/tutorial-02-redis_cli.md)
    * [More features about series: wget\_to\_redis](docs/en/tutorial-03-wget_to_redis.md)
* Server
    * [First server: http\_echo\_server](docs/en/tutorial-04-http_echo_server.md)
    * [Asynchronous server: http\_proxy](docs/en/tutorial-05-http_proxy.md)
* Parallel tasks and Series
    * [A simple parallel wget: parallel\_wget](docs/en/tutorial-06-parallel_wget.md)
* Important topics
    * [About error](docs/en/about-error.md)
    * [About timeout](docs/en/about-timeout.md)
    * [About global configuration](docs/en/about-config.md)
    * [About DNS](docs/en/about-dns.md)
    * [About exit](docs/en/about-exit.md)
* Computing tasks
    * [Using the built-in algorithm factory: sort\_task](docs/en/tutorial-07-sort_task.md)
    * [User-defined computing task: matrix\_multiply](docs/en/tutorial-08-matrix_multiply.md)
    * [Use computing task in a simple way: go task](docs/en/about-go-task.md)
* Asynchronous File IO tasks
    * [HTTP server with file IO: http\_file\_server](docs/en/tutorial-09-http_file_server.md)
* User-defined protocol
    * [A simple user-defined protocol: client/server](docs/en/tutorial-10-user_defined_protocol.md)
* Timing tasks and counting tasks
    * [About timer](docs/en/about-timer.md)
    * [About counter](docs/en/about-counter.md)
* Service governance
    * [About service governance](docs/en/about-service-management.md)
    * [More documents about upstream](docs/en/about-upstream.md)
* Connection context
    * [About connection context](docs/en/about-connection-context.md)
* Built-in protocols
    * [Asynchronous MySQL client: mysql\_cli](docs/en/tutorial-12-mysql_cli.md)
    * [Asynchronous Kafka client: kafka\_cli](docs/en/tutorial-13-kafka_cli.md)

#### System design features

We believe that a typical back-end program = protocol + algorithm + workflow, and that these three parts should be developed completely independently.

* Protocol
    * In most cases, users use built-in common network protocols, such as HTTP, Redis or various RPC protocols.
    * Users can also easily define their own network protocol. To do so, they only need to provide serialization and deserialization functions to define their own client/server.
* Algorithm
    * In our design, the algorithm is a concept symmetrical to the protocol.
    * If a protocol call is an RPC, then an algorithm call is an APC (Async Procedure Call).
    * We have provided some general algorithms, such as sort, merge, psort and reduce, which can be used directly.
    * Compared with a user-defined protocol, a user-defined algorithm is much more common. Any complicated computation with clear boundaries should be packaged into an algorithm.
* Workflow
    * Workflow is the actual business logic, which puts the protocols and algorithms into the flow graph for use.
    * The typical workflow is a closed series-parallel graph. Complex business logic may be a non-closed DAG.
    * The workflow graph can be constructed directly or dynamically generated based on the results of each step. All tasks are executed asynchronously.
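
As a rough illustration of the "serialization and deserialization functions" mentioned under Protocol, the sketch below frames a message as a 4-byte big-endian body length followed by the body. The wire format, the class name `FramedMessage` and its methods are assumptions made up for this example and are not the Workflow API; the point is only that a protocol is defined by how a message is written to bytes and how it is rebuilt from bytes that arrive in arbitrary chunks.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical wire format for illustration: a 4-byte big-endian body length
// followed by the body. The class and its methods are not the Workflow API.
class FramedMessage {
public:
    // Serialization: produce the bytes to put on the wire.
    static std::string encode(const std::string& body)
    {
        assert(body.size() <= 0xFFFFFFFFu);   // the length must fit in 4 bytes
        const std::uint32_t n = static_cast<std::uint32_t>(body.size());
        std::string wire;
        wire.push_back(static_cast<char>((n >> 24) & 0xFF));
        wire.push_back(static_cast<char>((n >> 16) & 0xFF));
        wire.push_back(static_cast<char>((n >> 8) & 0xFF));
        wire.push_back(static_cast<char>(n & 0xFF));
        wire += body;
        return wire;
    }

    // Deserialization: feed received bytes in chunks of any size.
    // Returns true once a whole message has arrived. Bytes after the end of
    // the message are ignored in this sketch.
    bool append(const char* data, std::size_t size)
    {
        for (std::size_t i = 0; i < size && !complete(); ++i) {
            if (head_received_ < 4) {
                // Still reading the length prefix, most significant byte first.
                body_size_ = (body_size_ << 8) | static_cast<unsigned char>(data[i]);
                ++head_received_;
            } else {
                body_.push_back(data[i]);
            }
        }
        return complete();
    }

    bool complete() const
    {
        return head_received_ == 4 && body_.size() == body_size_;
    }

    const std::string& body() const { return body_; }

private:
    std::uint32_t body_size_ = 0;
    int head_received_ = 0;
    std::string body_;
};
```

Because `append` keeps its state between calls, the receiver does not need to know how the network split the message; it just keeps feeding bytes until `append` returns true.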

Basic task, task factory and complex task

* Our system contains six basic tasks: networking, file IO, CPU, GPU, timer, and counter.
* All tasks are generated by the task factory and automatically recycled after callback.
    * Server task is one kind of special networking task, generated by the framework which calls the task factory, and handed over to the user through the process function.
* In most cases, the task generated by the user through the task factory is a complex task, which is transparent to the user.
    * For example, an HTTP request may include many asynchronous processes (DNS, redirection), but for the user, it is just a networking task.
    * File sorting seems to be an algorithm, but it actually includes many complex interaction processes between file IO and CPU computation.
    * If you think of business logic as building circuits with well-designed electronic components, then each electronic component may be a complex circuit.

Asynchrony and encapsulation based on `C++11 std::function`

* Not based on user-mode coroutines. Users need to know that they are writing asynchronous programs.
* All calls are executed asynchronously, and almost no operation occupies a thread.
    * Although we also provide some facilities with semi-synchronous interfaces, they are not core features.
* We try to avoid requiring users to derive classes, and encapsulate user behavior with `std::function` instead, including:
    * The callback of any task.
    * Any server's process. This conforms to the `FaaS` (Function as a Service) idea.
    * The realization of an algorithm is simply a `std::function`. But the algorithm can also be implemented by derivation.
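
The idea of passing behavior as a `std::function` instead of deriving a class can be sketched with the standard library alone. `Request`, `Response` and `Server` below are hypothetical stand-ins written for this illustration, not the Workflow types; the sketch assumes a synchronous `handle` call in place of the framework receiving a request.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// Hypothetical types for illustration only; this is not the Workflow API.
struct Request  { std::string path; };
struct Response { std::string body; };

class Server {
public:
    // User logic is supplied as a callable, so no subclass is needed.
    using Process = std::function<void(const Request&, Response&)>;

    explicit Server(Process process) : process_(std::move(process))
    {
        assert(process_ && "a process function is required");
    }

    // Stand-in for the framework receiving one request and invoking the
    // user's process function on it.
    Response handle(const Request& req) const
    {
        Response resp;
        process_(req, resp);
        return resp;
    }

private:
    Process process_;
};
```

A lambda with captures, a free function or a bound member function can all be passed as the `Process`, which is what makes this style convenient compared with overriding a virtual method.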

Memory reclamation mechanism

* Every task will be automatically reclaimed after the callback. If a task is created but the user does not want to run it, the user needs to release it through the `dismiss` method.
    * Any data in the task, such as the response of the network request, will also be recycled with the task. If the data is still needed, the user can use `std::move()` to move it out before the task is recycled.
* SeriesWork and ParallelWork are two kinds of framework objects, which are also recycled after their callback.
    * When a series is a branch of a parallel, it will be recycled after the callback of the parallel that it belongs to.
* This project doesn’t use `std::shared_ptr` to manage memory.
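
The ownership rule described above (a task is reclaimed after its callback, an unstarted task must be dismissed, and data is kept by moving it out) can be sketched as follows. `FetchTask` is a hypothetical class written for this illustration and is not the Workflow implementation; in particular, it completes inline inside `start`, whereas a real asynchronous engine would return immediately and invoke the callback later.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// Hypothetical class for illustration only; this is not the Workflow API.
class FetchTask {
public:
    using Callback = std::function<void(FetchTask*)>;

    // Tasks can only be created on the heap, through this factory function.
    static FetchTask* create(Callback cb) { return new FetchTask(std::move(cb)); }

    // Runs the task, invokes the callback, then deletes the task.
    // The caller must not touch the task after calling start().
    void start()
    {
        response_ = "payload";   // pretend a response has arrived
        if (callback_)
            callback_(this);
        delete this;             // reclaimed automatically after the callback
    }

    // Releases a task that was created but will never be started.
    void dismiss() { delete this; }

    // Valid only until the callback returns; move the data out to keep it.
    std::string& response() { return response_; }

    // Number of tasks currently alive, used here to observe reclamation.
    static int live_count() { return live_; }

private:
    explicit FetchTask(Callback cb) : callback_(std::move(cb)) { ++live_; }
    ~FetchTask() { --live_; assert(live_ >= 0); }

    Callback callback_;
    std::string response_;
    static int live_;
};

int FetchTask::live_ = 0;
```

The private destructor prevents callers from deleting a task or placing it on the stack, so the only ways to end its life are `start` and `dismiss`. This is one way to get deterministic reclamation without `std::shared_ptr`.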

#### More design documents

To be continued...