What logical structure is used for storing rows in Oracle?

Tags: mysql, oracle

From the point of view of logical data organization, there are tablespaces, which consist of segments, which consist of extents, which consist of Oracle blocks, which in turn map to OS blocks. What logical structure actually holds the data of a concrete record, i.e. where the data of all fields of the same record resides side by side in the filesystem? In the MySQL world, there is the data page (8 or 16 KB) that holds one or more records at the same physical address.
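
For reference, here is how that hierarchy shows up in Oracle's data dictionary (a sketch; EMP is just a stand-in table name):

    -- One row per extent of the segment: its tablespace, starting file/block,
    -- and size in blocks.
    SELECT owner, segment_name, tablespace_name,
           extent_id, file_id, block_id, blocks
    FROM   dba_extents
    WHERE  segment_name = 'EMP'
    ORDER  BY extent_id;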

Also, how do you determine which size is best? The more such pages exist, the more data fragmentation will appear, and that might affect performance.

Best Answer

The block is the structure that holds row data. The Logical Storage Structures documentation has a detailed overview of all this.
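
If you want to see this for yourself, each row's ROWID encodes the datafile and block it currently lives in. A minimal sketch (EMP is again a stand-in name):

    -- Decode each row's ROWID into the file and block that hold it;
    -- rows sharing a block_no sit side by side in that block.
    SELECT DBMS_ROWID.ROWID_RELATIVE_FNO(rowid) AS rel_file_no,
           DBMS_ROWID.ROWID_BLOCK_NUMBER(rowid) AS block_no,
           COUNT(*)                             AS rows_in_block
    FROM   emp
    GROUP  BY DBMS_ROWID.ROWID_RELATIVE_FNO(rowid),
              DBMS_ROWID.ROWID_BLOCK_NUMBER(rowid)
    ORDER  BY rel_file_no, block_no;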

Note that data for a single row can reside on more than one block (this is called row chaining), so it need not be contiguous on disk. Row migration can also leave just a "forwarding address" in the row's original block that points to the row's new block of residence.
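
If you suspect chaining or migration, one way to check (a sketch; assumes the CHAINED_ROWS table has been created with $ORACLE_HOME/rdbms/admin/utlchain.sql, and EMP is a stand-in):

    -- Instance-wide symptom: this statistic climbs every time a fetch has to
    -- follow a chained/migrated row piece into another block.
    SELECT name, value FROM v$sysstat WHERE name = 'table fetch continued row';

    -- Per-table listing of chained/migrated rows.
    ANALYZE TABLE emp LIST CHAINED ROWS INTO chained_rows;
    SELECT head_rowid FROM chained_rows WHERE table_name = 'EMP';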

All I/O to and from the SGA is done in block-sized chunks (possibly more than one block at a time, but, for the SGA, never smaller than a block).
The block size of the SYSTEM tablespace determines the database's "natural" block size. You can have other tablespaces with different block sizes, but:

  • you can't mix block sizes inside the same tablespace
  • you will need to set up separate cache areas for each block size (you can't have, for example, 8k blocks and 16k blocks in the same buffer cache); see the sketch after this list
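
A sketch of what that setup looks like (names, sizes, and paths are made up; the non-default cache must exist before the tablespace can be created):

    -- Carve out a buffer cache for 16k blocks alongside the default cache.
    ALTER SYSTEM SET DB_16K_CACHE_SIZE = 64M;

    -- Only then can a 16k-block tablespace be created.
    CREATE TABLESPACE ts16k
      DATAFILE '/u01/app/oracle/oradata/ORCL/ts16k01.dbf' SIZE 1G
      BLOCKSIZE 16K;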

So in general, it is simpler to use only one block size, if only to avoid the administrative and tuning overhead of managing the size of multiple block buffers.
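
To check which block sizes a database actually uses, the standard dictionary views are enough, along these lines:

    -- The database's default ("natural") block size, fixed at creation time.
    SELECT value AS db_block_size FROM v$parameter WHERE name = 'db_block_size';

    -- Per-tablespace block sizes; any non-default size needs its own cache.
    SELECT tablespace_name, block_size FROM dba_tablespaces ORDER BY tablespace_name;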

How you determine the optimal block size for a database is a compromise, just like pretty much everything else.
Smaller blocks mean your SGA is more granular, but each block carries relatively more overhead. And wide rows in small blocks are more likely to chain or migrate, which can be a source of performance problems.
Larger blocks mean less chance of chaining (assuming PCTFREE and PCTUSED are reasonably set). But they also mean larger single-block I/Os and less granularity in your SGA. Since a change to any row of a block requires the whole block to be written out to disk, larger block sizes can lead to more write I/O volume than smaller ones.
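
For reference, those thresholds are set per segment. A sketch (table and values are made up; note that PCTUSED is ignored in ASSM tablespaces):

    -- Keep 20% of each block free for rows to grow into (fewer migrations);
    -- let a block rejoin the free list once it falls below 40% used.
    CREATE TABLE orders (
      id      NUMBER PRIMARY KEY,
      payload VARCHAR2(4000)
    )
    PCTFREE 20
    PCTUSED 40;
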
Another factor to take into account is that traditional (non-bigfile) datafiles are limited to 4M blocks. That's about 32G per file for 8k blocks, but only 8G for 2k blocks. Since the number of files per (non-bigfile) tablespace is limited (to 1022), this could be important. (There is also a maximum of 65533 files per database, possibly less; see the physical database limits documentation.)

Transactional systems tend to use relatively smaller block sizes than data warehousing systems, but that isn't anything like a hard and fast rule. (For example, current Oracle best-practice guidance on Exadata is 8k blocks regardless of scenario.)