Chinese version (中文版)

license MIT C++ platform

Sogou C++ Workflow

As the backend C++ programming standard in Sogou, Workflow is an industrial-grade programming engine.

Main functions and features:

  • An asynchronous engine based on C++11 std::function which aims to solve all the serial, parallel and asynchronous problems.
  • As a network framework, it is completely protocol-agnostic and faces applications directly.
    • It can be used as either a Redis client or an HTTP server.
    • Protocols are convenient to customize, so you can quickly build your own RPC system.
      • Sogou RPC is built on Sogou Workflow and is open source as an independent project. It supports the srpc, brpc and thrift protocols (benchmark).
    • Supports SSL (depends on openssl). Supports TCP, UDP, SCTP and other common transport layer protocols, including SSL over SCTP. UDP servers are not supported.
  • A variety of common Internet protocol implementations are natively integrated and used in a unified way.
    • Currently supports the http, redis, mysql and kafka protocols. You can directly access these resources or build servers for these protocols.
    • Quite possibly the only full-featured asynchronous mysql client for C++ currently available.
    • The DNS protocol is still being developed; for now the system library is used to access DNS.
  • Powerful scheduling of computing tasks.
    • Computing tasks, like communication tasks, can be added to the task flow, and each kind is scheduled separately by its own scheduler.
    • You can use it as a parallel programming engine without the network features.
    • Our main goal is to maximize the performance of every node even when the computing and communication environment is very complex.
    • Some common algorithm implementations are provided, such as parallel sorting and MapReduce.
    • In fact, all asynchronous processes (such as disk IO, GPU tasks, timers, etc.) can be scheduled in coordination.
      • On Linux, disk IO tasks are implemented with the native Linux aio, which is extremely efficient.
  • Supports any task flow with a DAG structure. However, in most cases users only need a series-parallel structure.
  • Built-in load balancing and powerful service governance features.
  • Easy to use in conjunction with other asynchronous engines.
  • A streaming communication engine is being developed.
  • When working as a server, it supports multi-process mode and precise graceful restart.
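
The series-parallel structure mentioned in the list above can be pictured with a small standard-library sketch. It does not use Workflow's API: `Step`, `run_series`, `run_parallel` and `series_parallel_demo` are hypothetical names chosen for illustration, and plain `std::thread` stands in for the framework's schedulers. Steps in a series run one after another, and a parallel block finishes only when every branch has finished.

```cpp
#include <cassert>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical illustration types; these are NOT part of Workflow.
using Step = std::function<void()>;

// A "series": run the steps strictly one after another.
inline void run_series(const std::vector<Step> &steps) {
    for (const Step &step : steps)
        step();
}

// A "parallel" block: run every branch (itself a series) on its own
// thread and return only after all branches have finished.
inline void run_parallel(const std::vector<std::vector<Step>> &branches) {
    std::vector<std::thread> threads;
    for (const auto &branch : branches)
        threads.emplace_back([&branch] { run_series(branch); });
    for (auto &t : threads)
        t.join();
}

// Two branches run in parallel, then a final step combines their results.
inline int series_parallel_demo() {
    int left = 0, right = 0, total = 0;
    std::vector<std::vector<Step>> branches = {
        { [&] { left = 1; },   [&] { left += 2; } },   // branch 1 touches only `left`
        { [&] { right = 10; }, [&] { right *= 2; } },  // branch 2 touches only `right`
    };
    run_series({
        [&] { run_parallel(branches); },
        [&] {
            // The parallel block has fully completed before this step starts.
            assert(left == 3 && right == 20);
            total = left + right;
        },
    });
    return total;
}
```

Calling `series_parallel_demo()` wires a parallel block and a combining step into one series. In the real framework the blocks would complete asynchronously rather than by joining threads; the sketch only shows the composition idea.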

Building

  • Supports Linux, macOS, FreeBSD, Windows and other systems so far. cmake is required.
    • The Windows version is temporarily released as an independent branch; it uses iocp as the basis for asynchronous communication while keeping the same external interface.
  • As the project is written in C/C++, users are expected to be proficient in C++ programming. It does not rely on boost or asio, so compilation is extremely fast.
  • It uses a small number of C++11 features, so users need to be able to use std::function and std::move.
  • Theoretically supports all CPU architectures and can be compiled and run on 32-bit or 64-bit arm processors. Big-endian CPUs have not been tested.
  • openssl is required. If high SSL performance is expected, openssl 1.1 or higher is strongly recommended.
  • No other dependencies. Several compression libraries, such as snappy and lz4, are included as unmodified source (required by the Kafka protocol).

Some features of design

  • The basic usage is simple and handy. Some features are designed to greatly reduce the programming difficulty compared with general C++ projects.
    • To spare users from deriving classes as much as possible, all user behaviors are wrapped in std::function, for example:
      • the callback after every task ends
      • the algorithms in computing tasks
      • one server corresponds to one std::function
    • To avoid complicated memory management, all tasks and frameworks are generated by factory classes, and their memory is recycled automatically. This means:
      • Every task is automatically deleted after its callback.
      • If users want to keep any data from a task (such as a network reply packet or the result of an algorithm), they need to use std::move to move it out.
      • We treat memory recycling as a strict and naturally logical mechanism, so we don't use shared_ptr.
    • Avoid using complicated parameter configuration.
      • We actually have many configurable parameters, but you can use the system without noticing that they exist.
      • If you have specific requirements for program behavior and resource ratio, you can definitely find the corresponding configuration items to maximize the performance of your program.
  • The project adopts a fully asynchronous design that is not transparent to users, which means users need to know that they are writing asynchronous programs.
    • Thanks to the convenience of std::function and the automatic memory recycling mechanism, we have carefully designed the simplest possible asynchronous usage for users.
    • There is no concept of user-mode threads. On the one hand, this is for performance. On the other hand, we have the concept of scheduling computing tasks (threaded tasks).
      • In our design, computing is one kind of asynchronous task, no different from communication.
      • Computing tasks are scheduled by independent thread groups according to specific algorithms. Note that they may not be executed immediately.
      • Since we have such computing tasks, user-mode threads become meaningless, and therefore users must understand asynchrony.
    • Because of the full asynchrony, almost all core calls are short, non-blocking operations.
      • That's why we don't recommend blocking in callbacks or doing complex calculations there. However, it is acceptable if the logic is quite simple.
  • Brief summary of the usage:
    • The user builds the program just like building a series-parallel circuit. The circuit can be generated at the beginning or dynamically while the program is running.
    • We provide various electronic components for users. For instance, one http request, one GPU matrix multiplication and one parallel sort can each be understood as an electronic component.
    • Every electronic component has its standard input and output. At the same time, every electronic component can itself be a complicated circuit, which users do not need to be aware of.
    • For example, an http request may go through multiple asynchronous processes such as DNS, redirect and retry, but the entire process is just one component from the users' perspective.
    • Users can easily define their own components, including algorithms and some kinds of communication.
      • Implementing stateless protocols is extremely simple. It may be a little more complicated when the protocol includes login, library selection, etc.; in that case, you can refer to the redis implementation.
    • Through the powerful Upstream system, complex service governance can be realized, such as communication node selection, load balancing, circuit breaking and recovery, master and slave, etc.
    • In conclusion, this is an enterprise-level, elegantly designed asynchronous framework which can cover almost all high-performance back-end service requirements.
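
The combination of std::function callbacks, factory-created tasks and automatic deletion described above can be sketched with the standard library alone. This is a toy model under stated assumptions, not Workflow's implementation: `ToyTask`, `create`, `start`, `output`, `live_count` and `run_once` are hypothetical names, and the "work" completes synchronously to keep the sketch short, whereas a real framework would complete it asynchronously.

```cpp
#include <atomic>
#include <functional>
#include <string>
#include <utility>

// Hypothetical toy type for illustration only; NOT one of Workflow's classes.
class ToyTask {
public:
    using Callback = std::function<void(ToyTask *)>;

    // Tasks can only be obtained from the factory function; the
    // constructor and destructor are private.
    static ToyTask *create(std::string input, Callback cb) {
        return new ToyTask(std::move(input), std::move(cb));
    }

    // Performs the (toy) work, runs the user's callback, then deletes the
    // task. The caller must not touch the task after start() returns.
    void start() {
        output_ = "echo:" + input_;
        if (callback_)
            callback_(this);
        delete this;
    }

    std::string &output() { return output_; }

    // Number of tasks currently alive, used to observe the automatic cleanup.
    static int live_count() { return live_.load(); }

private:
    ToyTask(std::string input, Callback cb)
        : input_(std::move(input)), callback_(std::move(cb)) { ++live_; }
    ~ToyTask() { --live_; }

    std::string input_;
    std::string output_;
    Callback callback_;
    static std::atomic<int> live_;
};

std::atomic<int> ToyTask::live_{0};

// Runs one task and keeps its result beyond the task's lifetime by
// moving it out inside the callback.
inline std::string run_once(const std::string &input) {
    std::string kept;
    ToyTask *task = ToyTask::create(input, [&kept](ToyTask *t) {
        kept = std::move(t->output());  // the task dies right after this callback
    });
    task->start();
    return kept;
}
```

After `run_once("hi")` returns, the result lives in the caller's string and `ToyTask::live_count()` is back to zero, with no shared_ptr involved. The user never writes `delete` and never derives a class; the behavior is supplied entirely through the callback.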

Tutorials

Authors

Description
C++ Parallel Computing and Asynchronous Networking Framework