本篇內容主要講解“PostgreSQL中的Tuplesortstate數據結構是怎樣的”,感興趣的朋友不妨來看看。本文介紹的方法操作簡單快捷,實用性強。下面就讓小編來帶大家學習“PostgreSQL中的Tuplesortstate數據結構是怎樣的”吧!
Tuplesortstate
Tuplesort操作的私有狀態.
/* * Possible states of a Tuplesort object. These denote the states that * persist between calls of Tuplesort routines. * Tuplesort對象可能的狀態. * 這些標示在Tuplesort例程調用之間會持久存在的狀態. */ typedef enum { //裝載元組,在內存限制之內 TSS_INITIAL, /* Loading tuples; still within memory limit */ //裝載元組到有界堆中 TSS_BOUNDED, /* Loading tuples into bounded-size heap */ //裝載元組,寫入到tape中 TSS_BUILDRUNS, /* Loading tuples; writing to tape */ //完全在內存中完成排序 TSS_SORTEDINMEM, /* Sort completed entirely in memory */ //完成排序,最后在tape上執行排序 TSS_SORTEDONTAPE, /* Sort completed, final run is on tape */ //不落地執行最后的歸并 TSS_FINALMERGE /* Performing final merge on-the-fly */ } TupSortStatus; /* * Parameters for calculation of number of tapes to use --- see inittapes() * and tuplesort_merge_order(). * 用于計算需要使用多少個tapes的參數.--- 詳細參見inittapes()和tuplesort_merge_order(). * * In this calculation we assume that each tape will cost us about 1 blocks * worth of buffer space. This ignores the overhead of all the other data * structures needed for each tape, but it's probably close enough. * 在這個計算中,我們假定每一個tape會大概消耗緩存空間的一個block. * 雖然已經忽略了所有每個tape依賴的其他數據結構,但已經非常接近了. * * MERGE_BUFFER_SIZE is how much data we'd like to read from each input * tape during a preread cycle (see discussion at top of file). * MERGE_BUFFER_SIZE表示在每一輪讀周期中我們將要從每個輸入taple中讀取的數據大小 */ #define MINORDER 6 /* minimum merge order */ #define MAXORDER 500 /* maximum merge order */ #define TAPE_BUFFER_OVERHEAD BLCKSZ #define MERGE_BUFFER_SIZE (BLCKSZ * 32) typedef int (*SortTupleComparator) (const SortTuple *a, const SortTuple *b, Tuplesortstate *state); /* * Private state of a Tuplesort operation. * Tuplesort操作的私有狀態. */ struct Tuplesortstate { //狀態 : 枚舉值詳見上面的信息 TupSortStatus status; /* enumerated value as shown above */ //sort key中的列數 int nKeys; /* number of columns in sort key */ //調用者需要隨機訪問? bool randomAccess; /* did caller request random access? */ //調用者是否指定了最大返回的元組的數目? bool bounded; /* did caller specify a maximum number of * tuples to return? */ //使用有界堆,則返回T bool boundUsed; /* true if we made use of a bounded heap */ //如為有界堆,這里存儲最大的元組個數 int bound; /* if bounded, the maximum number of tuples */ //SortTuple.tuple是否可以設置? bool tuples; /* Can SortTuple.tuple ever be set? */ //剩余可用內存大小(單位:字節) int64 availMem; /* remaining memory available, in bytes */ //允許的內存總大小(單位:字節) int64 allowedMem; /* total memory allowed, in bytes */ //tapes個數 int maxTapes; /* number of tapes (Knuth's T) */ //tapes個數 - 1 int tapeRange; /* maxTapes-1 (Knuth's P) */ //主要用于排序數據的內存上下文 MemoryContext sortcontext; /* memory context holding most sort data */ //用于元組數據的sortcontext的子上下文 MemoryContext tuplecontext; /* sub-context of sortcontext for tuple data */ //臨時文件中tapes的logtape.c對象 LogicalTapeSet *tapeset; /* logtape.c object for tapes in a temp file */ /* * These function pointers decouple the routines that must know what kind * of tuple we are sorting from the routines that don't need to know it. * They are set up by the tuplesort_begin_xxx routines. * 這些函數指針將必須知道排序的哪種元組的例程與不需要知道它的例程解耦. * * Function to compare two tuples; result is per qsort() convention, ie: * <0, 0, >0 according as a<b, a=b, a>b. The API must match * qsort_arg_comparator. * 比較兩個元組的函數,結果由每個qsort()約定,比如: * < 0, 0, >0代表a<b,a=b,a>b.API必須匹配qsort_arg_comparator. */ SortTupleComparator comparetup; /* * Function to copy a supplied input tuple into palloc'd space and set up * its SortTuple representation (ie, set tuple/datum1/isnull1). Also, * state->availMem must be decreased by the amount of space used for the * tuple copy (note the SortTuple struct itself is not counted). * 該函數用于拷貝一個輸入的元組到由palloc分配的內存空間中, * 同時設置SortTuple數據結構(比如設置tuple/datum1/isnull1等). * 同時,state->availMem必須減去用于元組拷貝的空間大小(注意:SortTuple結構體不計算在內). */ void (*copytup) (Tuplesortstate *state, SortTuple *stup, void *tup); /* * Function to write a stored tuple onto tape. The representation of the * tuple on tape need not be the same as it is in memory; requirements on * the tape representation are given below. Unless the slab allocator is * used, after writing the tuple, pfree() the out-of-line data (not the * SortTuple struct!), and increase state->availMem by the amount of * memory space thereby released. * 用于寫入元組到taple的函數. * tape中的元組聲明不需要與內存中的一致,tape中的聲明要求詳見下面說明. * 除非使用slab分配器,在寫入元組后,pfree() out-of-line的數據(不是SortTuple結構體), * 同時把剛才釋放的內存空間加到state->availMem中. */ void (*writetup) (Tuplesortstate *state, int tapenum, SortTuple *stup); /* * Function to read a stored tuple from tape back into memory. 'len' is * the already-read length of the stored tuple. The tuple is allocated * from the slab memory arena, or is palloc'd, see readtup_alloc(). * 從tape中讀取元組到內存中的函數. * 'len'是已讀取的存儲元組的長度.元組在slab內存空間/palloc中分配,詳細參考readtup_alloc()函數 */ void (*readtup) (Tuplesortstate *state, SortTuple *stup, int tapenum, unsigned int len); /* * This array holds the tuples now in sort memory. If we are in state * INITIAL, the tuples are in no particular order; if we are in state * SORTEDINMEM, the tuples are in final sorted order; in states BUILDRUNS * and FINALMERGE, the tuples are organized in "heap" order per Algorithm * H. In state SORTEDONTAPE, the array is not used. * 該數組保存排序內存中的元組.當前狀態為 * INITIAL:元組沒有特定的順序; * SORTEDINMEM:元組處于最終已排序的狀態; * BUILDRUNS/FINALMERGE:元組按算法H的'堆'順序組織. * SORTEDONTAPE:數組未使用. */ //SortTuple結構體數組 SortTuple *memtuples; /* array of SortTuple structs */ //當前存在的元組數 int memtupcount; /* number of tuples currently present */ //memtuples數組的已分配的大小 int memtupsize; /* allocated length of memtuples array */ //memtuples的增長仍在進行中? bool growmemtuples; /* memtuples' growth still underway? */ /* * Memory for tuples is sometimes allocated using a simple slab allocator, * rather than with palloc(). Currently, we switch to slab allocation * when we start merging. Merging only needs to keep a small, fixed * number of tuples in memory at any time, so we can avoid the * palloc/pfree overhead by recycling a fixed number of fixed-size slots * to hold the tuples. * 有時候元組的內存分配使用簡單的slab分配器實現而不是palloc(). * 在開始歸并時,同步切換至slab分配器.歸并只需要在內存中保持簡單/固定的元組數目, * 因此可以避免頻繁回收固定數目固定大小的slots(用于保存元組)而導致的palloc/pfree過載. * * For the slab, we use one large allocation, divided into SLAB_SLOT_SIZE * slots. The allocation is sized to have one slot per tape, plus one * additional slot. We need that many slots to hold all the tuples kept * in the heap during merge, plus the one we have last returned from the * sort, with tuplesort_gettuple. * 對于slab,使用大型分配器,拆分為SLAB_SLOT_SIZE個大小的slots. * 分配器的大小為每個tape一個slot,外加一個為的slot. * 我們需要這么多slots是因為在歸并期間需要保存所有在堆中的元組,外加tuplesort_gettuple從排序中最終返回的元組 * * Initially, all the slots are kept in a linked list of free slots. When * a tuple is read from a tape, it is put to the next available slot, if * it fits. If the tuple is larger than SLAB_SLOT_SIZE, it is palloc'd * instead. * 一開始,所有的slots在空閑slots中以鏈表的方式保存. * 從tape中讀取元組時,如合適的話,元組會放到下一個可用slot中. * 如果元組比SLAB_SLOT_SIZE要大,改用palloc分配內存空間. * * When we're done processing a tuple, we return the slot back to the free * list, or pfree() if it was palloc'd. We know that a tuple was * allocated from the slab, if its pointer value is between * slabMemoryBegin and -End. * 如果已完成元組的處理,返回slot到空閑鏈表中,如果使用palloc分配則使用pfree回收空間. * 如果元組指針值在slabMemoryBegin和slabMemoryEnd之間,那么我們可以知道元組是從slab中分配的. * * When the slab allocator is used, the USEMEM/LACKMEM mechanism of * tracking memory usage is not used. * 如使用了slab分配器,不會使用USEMEM/LACKMEM機制跟蹤內存使用. */ bool slabAllocatorUsed; //slab內存空間的起始位置 char *slabMemoryBegin; /* beginning of slab memory arena */ //slab內存空間的結束位置 char *slabMemoryEnd; /* end of slab memory arena */ //鏈表頭 SlabSlot *slabFreeHead; /* head of free list */ /* Buffer size to use for reading input tapes, during merge. */ //在歸并期間用于讀取輸入tapes的緩存大小 size_t read_buffer_size; /* * When we return a tuple to the caller in tuplesort_gettuple_XXX, that * came from a tape (that is, in TSS_SORTEDONTAPE or TSS_FINALMERGE * modes), we remember the tuple in 'lastReturnedTuple', so that we can * recycle the memory on next gettuple call. * 通過tuplesort_gettuple_XXX方法調用返回元組給調用者時,元組從tape中獲取 * (也就是說,TSS_SORTEDONTAPE/TSS_FINALMERGE模式),這時候會把元組放在'lastReturnedTuple'中, * 因此可在下次gettuple調用中回收內存. */ void *lastReturnedTuple; /* * While building initial runs, this is the current output run number. * Afterwards, it is the number of initial runs we made. * 在構建initial運行時,這是當前輸出的run編號. * 后續這是我們構建好的initial runs的編號. */ int currentRun; /* * Unless otherwise noted, all pointer variables below are pointers to * arrays of length maxTapes, holding per-tape data. * 除非特別注意,下面所有的指針變量是指向長度為maxTapes的數組,保存per-tape數據. */ /* * This variable is only used during merge passes. mergeactive[i] is true * if we are reading an input run from (actual) tape number i and have not * yet exhausted that run. * 該變量在歸并過程中使用. * mergeactive[i]為T如果我們從編號為i的tape中讀取數據,仍未在該run中消耗完畢. */ //活躍的輸入run源? bool *mergeactive; /* active input run source? */ /* * Variables for Algorithm D. Note that destTape is a "logical" tape * number, ie, an index into the tp_xxx[] arrays. Be careful to keep * "logical" and "actual" tape numbers straight! * 用于算法D的變量. * 注意destTape是一個邏輯tape編號,例如,是指向tp_xxx[]數組的索引. * 注意保持"邏輯"和"實際"tape編號的連續性. */ //Knuth's l int Level; /* Knuth's l */ //當前輸出tape(Knuth's j) int destTape; /* current output tape (Knuth's j, less 1) */ //目標斐波拉契run計數(A[]) int *tp_fib; /* Target Fibonacci run counts (A[]) */ //每一個tape上真正runs的編號 int *tp_runs; /* # of real runs on each tape */ //每一個tape(D[])上虛擬runs的編號 int *tp_dummy; /* # of dummy runs for each tape (D[]) */ //實際的tape編號(TAPE[]) int *tp_tapenum; /* Actual tape numbers (TAPE[]) */ //歸并輪中的活動輸入tapes編號 int activeTapes; /* # of active input tapes in merge pass */ /* * These variables are used after completion of sorting to keep track of * the next tuple to return. (In the tape case, the tape's current read * position is also critical state.) * 這些變量用于在排序完成后保持下一個返回元組時的跟蹤. * (在tape情況下,tape的當前讀取位置也是重要的狀態) */ //已完成輸出的實際tape編號 int result_tape; /* actual tape number of finished output */ //數組編號(僅用于SORTEDINMEM) int current; /* array index (only used if SORTEDINMEM) */ //是否到達EOF(用于游標) bool eof_reached; /* reached EOF (needed for cursors) */ /* markpos_xxx holds marked position for mark and restore */ //markpos_xxx保持已標記的位置,用于標記和存儲 //tape block編號(只用于SORTEDONTAPE) long markpos_block; /* tape block# (only used if SORTEDONTAPE) */ //存儲的"current",或者tape塊中的偏移 int markpos_offset; /* saved "current", or offset in tape block */ //存儲的eof_reached bool markpos_eof; /* saved "eof_reached" */ /* * These variables are used during parallel sorting. * 這些變量用于并行排序. * * worker is our worker identifier. Follows the general convention that * -1 value relates to a leader tuplesort, and values >= 0 worker * tuplesorts. (-1 can also be a serial tuplesort.) * worker是worker標識符ID. * 遵循一般約定,-1值與leader tuplesort相關,并且值>= 0表示worker tuplesorts。 * (在串行tuplesort時,-1也可以表示這種情況) * * shared is mutable shared memory state, which is used to coordinate * parallel sorts. * shared是可變的共享內存狀態,用于協調并行排序. * * nParticipants is the number of worker Tuplesortstates known by the * leader to have actually been launched, which implies that they must * finish a run leader can merge. Typically includes a worker state held * by the leader process itself. Set in the leader Tuplesortstate only. * nParticipants是已知的worker Tuplesortstates的數目,這些worker由leader感知,是實際啟動的worker數, * 這也意味著在leader可以歸并前這些worker必須完成. * 典型的,leader進行自身包含至少一個worker狀態. * 只在leader的Tuplesortstate中設置. */ int worker; Sharedsort *shared; int nParticipants; /* * The sortKeys variable is used by every case other than the hash index * case; it is set by tuplesort_begin_xxx. tupDesc is only used by the * MinimalTuple and CLUSTER routines, though. * sortKeys變量用于every而不是hash index,通過tuplesort_begin_xxx設置. * tupDesc只由MinimalTuple和CLUSTER例程使用. */ TupleDesc tupDesc; //長度nKeys數組 SortSupport sortKeys; /* array of length nKeys */ /* * This variable is shared by the single-key MinimalTuple case and the * Datum case (which both use qsort_ssup()). Otherwise it's NULL. * 該變量在單鍵MinimalTuple和Datum情況下(使用qsort_ssup()函數)共享使用,否則的話值為NULL. */ SortSupport onlyKey; /* * Additional state for managing "abbreviated key" sortsupport routines * (which currently may be used by all cases except the hash index case). * Tracks the intervals at which the optimization's effectiveness is * tested. * 管理"縮寫鍵"sortsupport過程的額外狀態. * (除了hash index外會被其他情況使用) * 跟蹤在優化器有效性測試時時間間隔. */ int64 abbrevNext; /* Tuple # at which to next check * applicability */ /* * These variables are specific to the CLUSTER case; they are set by * tuplesort_begin_cluster. * 這些變量僅在CLUSTER時生效,通過tuplesort_begin_cluster設置. */ //將用于依賴的索引信息 IndexInfo *indexInfo; /* info about index being used for reference */ //解析索引表達式的運行期狀態 EState *estate; /* for evaluating index expressions */ /* * These variables are specific to the IndexTuple case; they are set by * tuplesort_begin_index_xxx and used only by the IndexTuple routines. * 這些變量僅用于IndexTuple. * 通過tuplesort_begin_index_xxx設置,僅用于IndexTuple例程. */ //數據表 Relation heapRel; /* table the index is being built on */ //正在創建的index Relation indexRel; /* index being built */ /* These are specific to the index_btree subcase: */ //這些僅在index_btree下使用 //如發現重復元組,則提示 bool enforceUnique; /* complain if we find duplicate tuples */ /* These are specific to the index_hash subcase: */ //index_hash情況 uint32 high_mask; /* masks for sortable part of hash code */ uint32 low_mask; uint32 max_buckets; /* * These variables are specific to the Datum case; they are set by * tuplesort_begin_datum and used only by the DatumTuple routines. * 這些變量用于Datum,通過tuplesort_begin_datum設置,僅用于DatumTuple例程. */ Oid datumType; /* we need typelen in order to know how to copy the Datums. */ //需要typelen用于知道如何拷貝Datums. int datumTypeLen; /* * Resource snapshot for time of sort start. * 在排序開始時的資源快照 */ #ifdef TRACE_SORT PGRUsage ru_start; #endif };
到此,相信大家對“PostgreSQL中的Tuplesortstate數據結構是怎樣的”有了更深的了解,不妨來實際操作一番吧!這里是億速云網站,更多相關內容可以進入相關頻道進行查詢,關注我們,繼續學習!
免責聲明:本站發布的內容(圖片、視頻和文字)以原創、轉載和分享為主,文章觀點不代表本網站立場,如果涉及侵權請聯系站長郵箱:is@yisu.com進行舉報,并提供相關證據,一經查實,將立刻刪除涉嫌侵權內容。