Using floats for PKs. Is it a good idea

database-designperformanceprimary-keyquery-performancesortingsqlite

Let's say I have an entity PizzaStore which has a location as part of its attributes
Now a PizzaStore can have N of other PizzaStore that are close by.
So seems to be a self referencing relationship 1-N (optional)
If we know before-hand which exact PizzaStore are close by each other what would be the best way to represent that?
Since we have a self-referencing 1-N relationship I think another table would be needed <pizza_store_id, other_pizza_store_close_by_id>
In this case we would store e.g.
<1, 22>, <1, 23>, <1, 78>, <2, 102> etc to show that pizza store with id 1 has 22, 23 and 78 near by etc

Now in order to get these rows back in-order I would need to create a PK and query based on that.
I was wondering would an auto-increment guarantee the insertion order?
Or would I need to use a float representing the distance e.g. <2.04, 1, 23> (where 2.04 is the distance in miles)
I was also thinking is there a better way than this?
We know that if 1 is close to 22 then 22 is also close to 1 right?
Is there a more efficient way to represent this information?
I think it would suffice to store just the row <1,22> to capture that 1 is close to 22 and that 22 is close to 1. But this way we are losing the order

Update:
All the answers are very useful. One thing though that perhaps is not clear.
I already have the order of the entities based on the distance on insert time.
I.e. I have a set of ids of objects already sorted based on distance and I want to insert this set in the database in a way that I can retrieve the rows back in the order I have inserted them. I want to avoid sorting on distance on retrieve since I already had the order at insert time. These ids would be used to join to another table (they are ids) and I want to retrieve the rows in the insert order

Best Answer

To answer your question directly: No. Using floats for a primary key is not a good idea. Why? Because floats can run into problems with rounding and precision (Is 1.20 the same as 1.2? Because mathematically it is.)

Personally I would create a primary key on a single field. Preferably an integer. Integers are typically more efficient than floats and there is no ambiguity about interpretation as with floats and strings.

The most important function of a primary key is to uniquely identify a single row in a table. It doesn't have to have any meaning or purpose beyond that. If it serves another purpose thats a bonus.

In your case, (as your title implies) I wouldn't use the distance as the primary key (on its own) because it may be possible that 2 combinations of stores are the same distance apart.

You could use it as part of a compound key. But in this example it doesn't make sense unless you wanted to store multiple distances between the same 2 stores.

To go a bit further: This looks to me like a spatial query issue. If you store an XY coordinate with each store, you can then do things like a radius search. "Give me a list of pizza shops with 1 mile, 2, 5 , 10 miles" etc. Or even better from a customer address you could determine which shop is closest.

If a new store is added it would automatically be included in the result set without having to manually calculate the distances. You could also calculate the distances between every shop.

The only shortcoming of this approach i can think of is that the distances are straight line and may not accurately reflect the distance you may have to travel between shops by road.

Depending upon what you are trying to achieve it may be useful to store the distance rather than recalculate it, but i personally prefer to derive values as needed.

Best Answer

Related Solutions

Mysql – Design for archiving a sorted list (with ability for a few INSERT)

SQL Server – Using Indexed Views for Aggregates: Too Good to Be True?

Related Question