ClickHouse and Friends (3): MySQL Protocol and Write Call Stack

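Original source: https://bohutang.me/2020/06/08/clickhouse-and-friends-mysql-protocol-write-stack/
The previous post, MySQL Protocol and Read Call Stack, walked through the call stack of a single ClickHouse query. This post continues with the call stack for writes. Let's get started.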
Write request

Create the table:

    mysql> CREATE TABLE test(a UInt8, b UInt8, c UInt8) ENGINE=MergeTree() PARTITION BY (a, b) ORDER BY c;
    Query OK, 0 rows affected (0.03 sec)

Insert data:

    INSERT INTO test VALUES(1,1,1), (2,2,2);

Call stack analysis

1. Get the storage engine OutputStream

    DB::StorageMergeTree::write(std::__1::shared_ptr<DB::IAST> const&, DB::Context const&) StorageMergeTree.cpp:174
    DB::PushingToViewsBlockOutputStream::PushingToViewsBlockOutputStream(std::__1::shared_ptr<DB::IStorage> const&, DB::Context const&, std::__1::shared_ptr<DB::IAST> const&, bool) PushingToViewsBlockOutputStream.cpp:110
    DB::InterpreterInsertQuery::execute() InterpreterInsertQuery.cpp:229
    DB::executeQueryImpl(const char*, const char*, DB::Context&, bool, DB::QueryProcessingStage::Enum, bool, DB::ReadBuffer*) executeQuery.cpp:364
    DB::executeQuery(DB::ReadBuffer&, DB::WriteBuffer&, bool, DB::Context&, std::__1::function<void(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&)>) executeQuery.cpp:696
    DB::MySQLHandler::comQuery(DB::ReadBuffer&) MySQLHandler.cpp:311
    DB::MySQLHandler::run() MySQLHandler.cpp:141

2. Assemble the InputStream from SQL

How do (1,1,1), (2,2,2) get assembled into an input stream structure?

    DB::InputStreamFromASTInsertQuery::InputStreamFromASTInsertQuery(std::__1::shared_ptr<DB::IAST> const&, DB::ReadBuffer*,
    DB::InterpreterInsertQuery::execute() InterpreterInsertQuery.cpp:300
    DB::executeQueryImpl(char const*, char const*, DB::Context&, bool, DB::QueryProcessingStage::Enum, bool, DB::ReadBuffer*) executeQuery.cpp:386
    DB::MySQLHandler::comQuery(DB::ReadBuffer&) MySQLHandler.cpp:313
    DB::MySQLHandler::run() MySQLHandler.cpp:150

Then:

    res.in = std::make_shared<InputStreamFromASTInsertQuery>(query_ptr, nullptr, query_sample_block, context, nullptr);
    res.in = std::make_shared<NullAndDoCopyBlockInputStream>(res.in, out_streams.at(0));

The Block is then built through the copyData call made from NullAndDoCopyBlockInputStream::readImpl():

    DB::ValuesBlockInputFormat::readRow(std::__1::vector<COW<DB::IColumn>::mutable_ptr<DB::IColumn>, std::__1::allocator<COW<DB::IColumn>::mutable_ptr<DB::IColumn>>>&, unsigned long) ValuesBlockInputFormat.cpp:93
    DB::ValuesBlockInputFormat::generate() ValuesBlockInputFormat.cpp:55
    DB::ISource::work() ISource.cpp:48
    DB::InputStreamFromInputFormat::readImpl() InputStreamFromInputFormat.h:48
    DB::IBlockInputStream::read() IBlockInputStream.cpp:57
    DB::InputStreamFromASTInsertQuery::readImpl() InputStreamFromASTInsertQuery.h:31
    DB::IBlockInputStream::read() IBlockInputStream.cpp:57
    void DB::copyDataImpl<DB::copyData(DB::IBlockInputStream&, DB::IBlockOutputStream&, std::__1::atomic<bool>*)::$_0&, void(&)(DB::Block const&)>(DB::IBlockInputStream&, DB::IBlockOutputStream&, DB::copyData(DB::IBlockInputStream&, DB::IBlockOutputStream&, std::__1::atomic<bool>*)::$_0&, void(&)(DB::Block const&)) copyData.cpp:26
    DB::copyData(DB::IBlockInputStream&, DB::IBlockOutputStream&, std::__1::atomic<bool>*) copyData.cpp:62
    DB::NullAndDoCopyBlockInputStream::readImpl() NullAndDoCopyBlockInputStream.h:47
    DB::IBlockInputStream::read() IBlockInputStream.cpp:57
    void DB::copyDataImpl<std::__1::function<bool()> const&, std::__1::function<void(DB::Block const&)> const&>(DB::IBlockInputStream&, DB::IBlockOutputStream&, std::__1::function<bool()> const&, std::__1::function<void(DB::Block const&)> const&) copyData.cpp:26
    DB::copyData(DB::IBlockInputStream&, DB::IBlockOutputStream&, std::__1::function<bool()> const&, std::__1::function<void(DB::Block const&)> const&) copyData.cpp:73
    DB::executeQuery(DB::ReadBuffer&, DB::WriteBuffer&, bool, DB::Context&, std::__1::function<void(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&)>) executeQuery.cpp:785
    DB::MySQLHandler::comQuery(DB::ReadBuffer&) MySQLHandler.cpp:313
    DB::MySQLHandler::run() MySQLHandler.cpp:150

3. Assemble the OutputStream

    DB::InterpreterInsertQuery::execute() InterpreterInsertQuery.cpp:107
    DB::executeQueryImpl(const char*, const char*, DB::Context&, bool, DB::QueryProcessingStage::Enum, bool, DB::ReadBuffer*) executeQuery.cpp:364
    DB::executeQuery(DB::ReadBuffer&, DB::WriteBuffer&, bool, DB::Context&, std::__1::function<void(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&)>) executeQuery.cpp:696
    DB::MySQLHandler::comQuery(DB::ReadBuffer&) MySQLHandler.cpp:311
    DB::MySQLHandler::run() MySQLHandler.cpp:141

Assembly order:

    NullAndDoCopyBlockInputStream
    CountingBlockOutputStream
    AddingDefaultBlockOutputStream
    SquashingBlockOutputStream
    PushingToViewsBlockOutputStream
    MergeTreeBlockOutputStream

4. Write to the OutputStream

    DB::MergeTreeBlockOutputStream::write(DB::Block const&) MergeTreeBlockOutputStream.cpp:17
    DB::PushingToViewsBlockOutputStream::write(DB::Block const&) PushingToViewsBlockOutputStream.cpp:145
    DB::SquashingBlockOutputStream::finalize() SquashingBlockOutputStream.cpp:30
    DB::SquashingBlockOutputStream::writeSuffix() SquashingBlockOutputStream.cpp:50
    DB::AddingDefaultBlockOutputStream::writeSuffix() AddingDefaultBlockOutputStream.cpp:25
    DB::CountingBlockOutputStream::writeSuffix() CountingBlockOutputStream.h:37
    DB::copyDataImpl<DB::copyData(DB::IBlockInputStream&, DB::IBlockOutputStream&, std::__1::atomic<bool>*)::<lambda()>&, void(&)(const DB::Block&)>(DB::IBlockInputStream&, DB::IBlockOutputStream&, <lambda()>&, void(&)(const DB::Block&)) copyData.cpp:52
    DB::copyData(DB::IBlockInputStream&, DB::IBlockOutputStream&, std::__1::atomic<bool>*) copyData.cpp:138
    DB::NullAndDoCopyBlockInputStream::readImpl() NullAndDoCopyBlockInputStream.h:57
    DB::IBlockInputStream::read() IBlockInputStream.cpp:60
    void DB::copyDataImpl<std::__1::function<bool()> const&, std::__1::function<void(DB::Block const&)> const&>(DB::IBlockInputStream&, DB::IBlockOutputStream&, std::__1::function<bool()> const&, std::__1::function<void(DB::Block const&)> const&) copyData.cpp:29
    DB::copyData(DB::IBlockInputStream&, DB::IBlockOutputStream&, std::__1::function<bool()> const&, std::__1::function<void(DB::Block const&)> const&) copyData.cpp:154
    DB::executeQuery(DB::ReadBuffer&, DB::WriteBuffer&, bool, DB::Context&, std::__1::function<void(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&)>) executeQuery.cpp:748
    DB::MySQLHandler::comQuery(DB::ReadBuffer&) MySQLHandler.cpp:311
    DB::MySQLHandler::run() MySQLHandler.cpp:141

copyData passes the data down through the OutputStream layers, one after another, until it reaches MergeTreeBlockOutputStream.
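
The layering in step 4 can be illustrated with a small, self-contained sketch. It is not ClickHouse code: ToyBlock, ToyOutputStream, ToySinkStream, ToyCountingStream and ToySquashingStream are hypothetical names, and the row threshold is an assumption made for illustration. It only mimics what the stack suggests: each stream wraps the next one, and a buffering layer hands its pending data down when writeSuffix() is called.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a Block: just a batch of rows.
struct ToyBlock
{
    std::vector<std::vector<int>> rows;
};

// Hypothetical interface with only the two calls that appear in the stacks.
struct ToyOutputStream
{
    virtual ~ToyOutputStream() = default;
    virtual void write(const ToyBlock & block) = 0;
    virtual void writeSuffix() {}
};

// Bottom layer: stands in for the storage engine's stream and records what arrives.
struct ToySinkStream : ToyOutputStream
{
    std::vector<ToyBlock> received;
    void write(const ToyBlock & block) override { received.push_back(block); }
};

// Wrapper that counts rows and forwards every call unchanged.
struct ToyCountingStream : ToyOutputStream
{
    std::shared_ptr<ToyOutputStream> next;
    std::size_t rows_seen = 0;

    explicit ToyCountingStream(std::shared_ptr<ToyOutputStream> next_) : next(std::move(next_))
    {
        assert(next != nullptr);
    }

    void write(const ToyBlock & block) override
    {
        rows_seen += block.rows.size();
        next->write(block);
    }

    void writeSuffix() override { next->writeSuffix(); }
};

// Wrapper that buffers small blocks and forwards them once min_rows is reached
// (an assumed threshold) or when writeSuffix() signals the end of the data.
struct ToySquashingStream : ToyOutputStream
{
    std::shared_ptr<ToyOutputStream> next;
    std::size_t min_rows;
    ToyBlock pending;

    ToySquashingStream(std::shared_ptr<ToyOutputStream> next_, std::size_t min_rows_)
        : next(std::move(next_)), min_rows(min_rows_)
    {
        assert(next != nullptr);
    }

    void write(const ToyBlock & block) override
    {
        pending.rows.insert(pending.rows.end(), block.rows.begin(), block.rows.end());
        if (pending.rows.size() >= min_rows)
            flush();
    }

    void writeSuffix() override
    {
        flush();
        next->writeSuffix();
    }

    void flush()
    {
        if (pending.rows.empty())
            return;
        next->write(pending);
        pending.rows.clear();
    }
};
```

Chaining ToyCountingStream over ToySquashingStream over ToySinkStream and writing the two rows (1,1,1) and (2,2,2) leaves the sink empty until writeSuffix() is called on the outermost stream. That matches the shape of the stack above, where MergeTreeBlockOutputStream::write is reached through SquashingBlockOutputStream::writeSuffix.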
5. Return to the Client

    DB::MySQLOutputFormat::finalize() MySQLOutputFormat.cpp:62
    DB::IOutputFormat::doWriteSuffix() IOutputFormat.h:78
    DB::OutputStreamToOutputFormat::writeSuffix() OutputStreamToOutputFormat.cpp:18
    DB::MaterializingBlockOutputStream::writeSuffix() MaterializingBlockOutputStream.h:22
    void DB::copyDataImpl<std::__1::function<bool()> const&, std::__1::function<void(DB::Block const&)> const&>(DB::IBlockInputStream&, DB::IBlockOutputStream&, std::__1::function<bool()> const&, std::__1::function<void(DB::Block const&)> const&) copyData.cpp:52
    DB::copyData(DB::IBlockInputStream&, DB::IBlockOutputStream&, std::__1::function<bool()> const&, std::__1::function<void(DB::Block const&)> const&) copyData.cpp:154
    DB::executeQuery(DB::ReadBuffer&, DB::WriteBuffer&, bool, DB::Context&, std::__1::function<void(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&)>) executeQuery.cpp:748
    DB::MySQLHandler::comQuery(DB::ReadBuffer&) MySQLHandler.cpp:311
    DB::MySQLHandler::run() MySQLHandler.cpp:141

Summary

    INSERT INTO test VALUES(1,1,1), (2,2,2);

First, the kernel parses the SQL statement into an AST and uses the AST to pick the Interpreter: InterpreterInsertQuery.
Next, the Interpreter adds the corresponding OutputStreams one after another.
Then data is read from the InputStream and written to the OutputStream. It passes down through the stream layers until it reaches the underlying storage engine.
Finally, the result is written to the socket output and returned to the client.
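
As a rough illustration of the read side, here is a minimal sketch of turning the VALUES text into column-oriented data. The ValuesBlockInputFormat::readRow frame in step 2 takes a vector of mutable columns, which suggests that each parsed row is appended across the columns. ToyColumnBlock and parseToyValues are hypothetical names, not ClickHouse APIs, and the parser assumes non-negative integers only, with no UInt8 range check.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical column-oriented block: columns[i][r] is the value of column i in row r.
struct ToyColumnBlock
{
    std::vector<std::vector<unsigned>> columns;
    std::size_t rows() const { return columns.empty() ? 0 : columns.front().size(); }
};

// Parse text such as "(1,1,1), (2,2,2)" into num_columns columns.
// Throws std::runtime_error on malformed input.
ToyColumnBlock parseToyValues(const std::string & text, std::size_t num_columns)
{
    assert(num_columns > 0);
    ToyColumnBlock block;
    block.columns.resize(num_columns);
    std::size_t pos = 0;

    auto skipSpaces = [&]
    {
        while (pos < text.size() && std::isspace(static_cast<unsigned char>(text[pos])))
            ++pos;
    };
    auto expect = [&](char c)
    {
        skipSpaces();
        if (pos >= text.size() || text[pos] != c)
            throw std::runtime_error(std::string("expected '") + c + "'");
        ++pos;
    };
    auto readNumber = [&]() -> unsigned
    {
        skipSpaces();
        if (pos >= text.size() || !std::isdigit(static_cast<unsigned char>(text[pos])))
            throw std::runtime_error("expected a number");
        unsigned value = 0;
        while (pos < text.size() && std::isdigit(static_cast<unsigned char>(text[pos])))
            value = value * 10 + static_cast<unsigned>(text[pos++] - '0');
        return value;
    };

    skipSpaces();
    while (pos < text.size())
    {
        expect('(');
        for (std::size_t i = 0; i < num_columns; ++i)
        {
            if (i > 0)
                expect(',');
            // One value per column: the row is spread across the columns.
            block.columns[i].push_back(readNumber());
        }
        expect(')');
        skipSpaces();
        if (pos < text.size())
        {
            expect(',');  // separator before the next tuple
            skipSpaces();
        }
    }
    return block;
}
```

Calling parseToyValues("(1,1,1), (2,2,2)", 3) yields three columns of two values each, so the rows of the INSERT end up stored column by column before they are handed to the output side.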
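ClickHouse's OutputStream orchestration is still fairly complex and lacks Pipeline-style scheduling and orchestration. Because the pattern is largely fixed, though, it remains reasonably clear for now.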
Links in this article

ClickHouse and Friends (2): MySQL Protocol and Read Call Stack

End of article.
Enjoy ClickHouse:)