# Generated by getTODO.pm on Mon Sep 27 01:46:45 2004
# for Genezzo Version 0.25 - Alpha 20040919

Some general TODO categories:

APIs:
  See if Genezzo can be embedded in Apache.
  real DBI support (DBI::Genezzo)
  web-based management console

Missing SQL features:
  Binds and joins top the list.
  Also: sorting/aggregation, subquery support, views, explain plan

Multiuser support issues:
  transactions, logging, recovery, 
  shared memory buffer cache
  exclusive table locks first, then read share/write exclusive,
  then row locks

  users/roles, sessions, schemas, tablespaces, authentication

query optimization:
  Rule/Cost-based optimization
  costing by index probe

fancier functionality:
  btrees with overflow blocks for long keys
  block migration
  block-level predicate pushdown, aggregate pushdown

  yaml datatype support

  antlr parser

  file encryption, row compression
  MLS: multi level security

  parallel/distributed operation, replication, 
  scalability, fault-tolerance

  user-defined functions, indexes, datatypes

  non-blocking aggregation based upon count estimation

  unicode support

  error messages


space management:
  Lehman and Yao, "Efficient Locking for Concurrent
  Operations on B-Trees", ACM TODS v6, #4, Dec 81, pp 650-670.

  SCN/LSN block header information

  freelists, extent headers


Per-file TODO breakdown follows:

TODO lib/Genezzo/Block/RDBlkA.pm
    HSplice: offset calculation must match offset2hkey in RDBlock. Special
    handling needed if inherited by RDBlk_NN?

TODO lib/Genezzo/Block/RDBlk_NN.pm
    build simple test cases
    build complex test cases
    test thoroughly
    packdeleted: make this work. It's broken!
    integration with bt2 - need to packdelete in bsplit, do null checks in
    leaf blocks (branch blocks should be ok)
    need a validation function to ensure that block maintains invariant:
    small number of leading metadata rows starting at row zero, followed by
    data rows (deletes ok). Easier to support non-split rows initially, but
    should be able to support head rows (need mods to splice functions to
    preserve rowstats for this case).
    need to modify metadata methods so all metadata created in first n rows.
    could simply have delete really delete the rows, so no changes necessary
    for rdblock clients (i.e., no "null rows" generated).

TODO lib/Genezzo/Block/RDBlock.pm
    use row directory rowlen vs len/value for row storage
    meta row - should binary search for meta id
    unicode support

TODO lib/Genezzo/Block/Std.pm
    Support for completely variable block headers

TODO lib/Genezzo/BufCa/BCFile.pm
    buffer cache block zero should contain description of buffer cache
    layout
    need a way to free blocks associated with a file that is not currently
    in use

TODO lib/Genezzo/Dict.pm
    DictTableAllTab: need index on allfileused for delete
    DictTableAllTab: update tsfiles for usefile
    need some combo _get_table/corecolnum/getcol - create a custom iterator
    that returns specified cols
    non-unique index support using bt2 use_keycount. Need to separate notion
    of SQL uniqueness from btree concept of unique, since a non-unique SQL
    index is a unique btree with the rid as least-significant key col (vs
    rid as value col).
    need drop table/drop index linkage, delete constraints for table, etc
    constraints: can fix check constraint in update case -- don't need to
    check insert if check columns aren't modified.
    constraints: need not null/foreign key constraints
    constraints: need to limit one primary key per table, prevent creation
    of duplicate indexes on same ordered key columns
    expose drop index, drop constraint. tie drop index/drop table?
    check usage of HCount for max tid, max fileidx, max consid. This won't
    work if there are deletions
    DictTableUseFile: update space management to use this function correctly
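
    A minimal sketch of the non-unique index item above: a non-unique SQL
    index stored as a unique btree by appending the rid as the
    least-significant key column. Everything here is hypothetical and is
    not the bt2 API: a plain hash stands in for the unique btree, the rid
    strings are made up, and every entry is assumed to have the same number
    of key columns.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a unique btree: a hash keyed on a joined
# composite key. This is NOT the bt2 interface.
my %unique_tree;

# Build the composite key: SQL key columns first, rid last
# (least-significant position).
sub composite_key {
    my ($key_cols, $rid) = @_;
    return join("\0", @$key_cols, $rid);
}

# Insert into the "non-unique" index. The underlying structure stays
# unique because the rid disambiguates equal SQL keys.
sub insert_nonunique {
    my ($key_cols, $rid) = @_;
    my $ck = composite_key($key_cols, $rid);
    die "duplicate composite key\n" if exists $unique_tree{$ck};
    $unique_tree{$ck} = 1;    # no value column: the rid lives in the key
    return $ck;
}

# Return every rid stored under the given SQL key, in key order.
sub lookup_rids {
    my ($key_cols) = @_;
    my $prefix = join("\0", @$key_cols) . "\0";
    my @rids;
    for my $ck (sort keys %unique_tree) {
        next unless index($ck, $prefix) == 0;
        push @rids, substr($ck, length $prefix);
    }
    return @rids;
}

insert_nonunique(['smith'], '1/5/2');
insert_nonunique(['smith'], '1/7/0');
insert_nonunique(['jones'], '1/5/3');
my @smith = lookup_rids(['smith']);
print "smith rids: @smith\n";
```

    SQL uniqueness would then be a separate check (does any entry share
    the key prefix?) layered on top of the always-unique tree.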

TODO lib/Genezzo/Feeble.pm
    Use antlr (see antlr.org) to generate a parser, and toss this code.

TODO lib/Genezzo/GenDBI.pm
    This module is a bit of a catch-all, since it contains a DBI-style
    interface, an interactive loop with an interpreter and some presentation
    code, plus some expression evaluation and query planning logic. It needs
    to get split up.

TODO lib/Genezzo/Havok.pm
    Create dictionary initialization havok

TODO lib/Genezzo/Havok/UserExtend.pm
    Need to fix "import" mechanism so can load specific functions into
    Genezzo::GenDBI namespace.

TODO lib/Genezzo/Index/bt2.pm
    hkey/offset functions: should be able to convert between different
    "place" formats (Array and Hash prefixes), like the common fetch
    routine, or ASSERT that prefix matches.
    add reverse scan to search/SQLFetch
    support multicol keys, non-unique keys (via combo of key + rid as
    unique)
    support transaction unique constraints -- probably via treat key+rid as
    unique, then turn on true unique key, and scan for duplicates?
    find out why pctfree=0 doesn't work
    Work on RDBlk_NN support.
    search with startkey/stopkey support, vs supplying compare/equal
    methods. restricting the search api to straight "=","<" comparisons
    means can try the estimation function
    need to handle partial startkey/stopkey comparison in searchR/SQLFetch
    for multi-col keys
    semantics of nulls in multi-col keys -- sort low?
    simplify _pack_row with splice and a supplied split position, something
    like -1 for normal indexes (n-1 key cols, 1 val col, so pop the val) or
    "N=?" for index-organized tables (N key cols, M val cols, so splice N)
    reorganize along the lines of "GiST" Generalized Search Trees (Paul
    Aoki, J. Hellerstein, UCB)
    ecount support?
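
    The _pack_row item above might be sketched as follows. The helper name
    and calling convention are assumptions, not the existing _pack_row. It
    relies on Perl's documented splice behavior: a negative LENGTH removes
    everything from OFFSET onward except that many trailing elements, so one
    call covers both the "-1" and the "N" cases.

```perl
use strict;
use warnings;

# Hypothetical helper (not the existing _pack_row).
#   $split < 0 : the last -$split columns are values (-1 for a normal
#                index: n-1 key cols, 1 val col)
#   $split > 0 : the first $split columns are keys (index-organized
#                table: N key cols, M val cols)
sub split_key_val {
    my ($split, @cols) = @_;
    # splice removes the key columns and returns them; the value
    # columns are whatever remains in @cols.
    my @key = splice(@cols, 0, $split);
    return (\@key, \@cols);
}

# Normal index: pop the single value column.
my ($k1, $v1) = split_key_val(-1, 'smith', 'john', 'rid42');

# Index-organized table with N=2 key columns.
my ($k2, $v2) = split_key_val(2, 'k1', 'k2', 'v1', 'v2', 'v3');

print "keys: @$k1 / vals: @$v1\n";
print "keys: @$k2 / vals: @$v2\n";
```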

TODO lib/Genezzo/Index/bt3.pm
    new: maybe a way to get blocksize from rstab/rsfile and pass to bt2,
    versus passing it to each layer separately
    getMainMeta from first block of tied hash, but no guarantee that space
    management is nice enough to return blocks in allocation order. Should
    store block address of leftmost leaf in index table.
    spacecheck: space cache should simply be free extents allocated to the
    index. Need to extend smfile to have multiple free extents in spacelist,
    vs just used extents. Note still an issue for simultaneous inserts --
    need lots of space for pathological case where each parallel insert
    splits a separate subtree. That's why transactions were invented.

TODO lib/Genezzo/Index/btHash.pm
    figure out whether this should be a pushhash, hash, or rowsource
    SQLPrepare/Execute/Fetch: clean up. Shouldn't need to manage a
    distinction between using btHash as a row source and the old bt2 api.
    bt2 is wrong - should only have one Fetch style. Should be able to use
    the index start/stop key vs filtering.
    NEXTKEY: broken in "dump tsidx" for case where create 2 tables, insert
    some rows, then drop the first table (and don't COMMIT) and call dump
    tsidx. Loops in NEXTKEY - never terminates for allfileused index.
    Add ReadOnly mode so can view indexes, but not insert/update/delete.

TODO lib/Genezzo/Parse/FeebLex.pm
    quoted string support imperfect - case of WHERE col1="if ($foo->{baz})
    then blah();" not quite correct...

TODO lib/Genezzo/Row/RSIdx1.pm
    HSuck:
    FirstCount/NextCount: do real estimate vs fake
    should pass leftmost blockno explicitly versus rely on RSTab FIRSTKEY
    rectify some overlap between btHash and this module
    could encode multiple column key into single col rid using MIME::Base64
    encode of a packed row. should check dependency for perl 5.6 and add to
    Makefile.PL.

TODO lib/Genezzo/Row/RSTab.pm
    $href: remove - need a dict function to return allfileused via tso
    HSuck: need a way to specify packing method
    HSuck: fix trailing zero replacement
    NextCount: fix quitloop
    localPush/Store: qualify length packstr as percentage of blocksize
    (1/3?)
    localStore: race condition on rowstat
    localFetchDelete: frag flag info, delete status. Could express this
    function as a generalized "RowSplice" (as distinct from RDBlkA::HSplice,
    which is a block splice operator). Would need to be able to splice based
    upon column number/array offset, as well as substring byte offset -- the
    inverse functionality of PackRow2/HSuck
    DBI - support Bind and projection (returning only certain specified
    columns, versus all columns)
    _init: change to use TSTableAFU support versus href->{filesused}
    need support for constraints that "mutate" supplied values, e.g.
    manipulate numeric precision or supply default values for columns. Also
    need support for foreign keys in delete.

TODO lib/Genezzo/SpaceMan/SMFile.pm
    support for non-table objects like indexes - done?
    freetable: when last object is freed, need to update _tsfiles as UNUSED
    need to coalesce adjacent free extents
    maintain multiple free lists for performance
    better indexing scheme - maybe a btree
    chain the block header if necessary -- allocate a new block to hold
    additional free list information, append extent allocation to HEADER row
    (after 0:1)
    check status everywhere rows are updated
    maintain free extents list for each object, so can re-use extents
    (especially important for updates of large multi-block rows)

TODO lib/Genezzo/Tablespace.pm
    filearr, used, unused: should match dict _tsfiles fileidx - done 3.21?
    notion of buffercache associated with the tablespace object -- possible
    multiple active bc's, with different characteristics/semantics, e.g. a
    bc for temp space with different blocksize, lacking txn recovery? Need
    to guarantee that all clients of a tso use the same bc for
    consistency/locking/txn support

TODO lib/Genezzo/Util.pm
    packrow: store metadata in col0 vs trailing col with next ptr
    packrow: check pack format for a zero len row of zero cols. Does it need
    a nullvec?
    unpackrow: extend to support a prebuilt template when unpacking many
    rows with the same number of columns. Could probably store in an array.
    if (defined($a[$numcols])...
    packrow/unpackrow: in Perl 5.8 could use the nifty repeating templates
    to our advantage.
    packrow: could generate skiplists as col zero metadata tracking byte
    position and column numbers to speed lookups
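
    The prebuilt-template idea for unpackrow could look roughly like this.
    The 'n/a*' column format (16-bit length prefix followed by the bytes)
    is an assumption chosen for the sketch and is not Genezzo's actual row
    format; template_for is a hypothetical name.

```perl
use strict;
use warnings;

# Cache of unpack templates, indexed by column count.
my @template_cache;

# Hypothetical helper: build the template for $numcols columns once,
# then reuse it for every later row with the same column count.
sub template_for {
    my ($numcols) = @_;
    $template_cache[$numcols] = 'n/a*' x $numcols
        unless defined $template_cache[$numcols];
    return $template_cache[$numcols];
}

# Round-trip a three-column row through the cached template.
my $packed = pack(template_for(3), 'a', 'bb', 'ccc');
my @cols   = unpack(template_for(3), $packed);
print join('|', @cols), "\n";
```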