SQL Performance and Generating a Unique ID
Hello Aaron -
Thank you for the post it was something great to look into and to see. I have a different method that worked for me and my group, that was similar to your first example and something that “we” on 5/17/2016 had asked about. I adjusted the table that I had to have a “uniqueidentifier” as a field and then to have a calculated column based on that field to give me a 10 digit ID, which would be 9.99 billion (possibly what “we” was asking for.
CREATE TABLE [dbo].[tblClubID](
[uidNewID] [uniqueidentifier] NOT NULL,
[intClubID] AS (abs(checksum([uidNewID])))
)
Then I had a Stored Procedure for the entry/insert into the table. Instead of using the “NOT EXISTS” or “EXISTS” clauses /functions, I just did a search in the table looking for that ID already with an equal and put that, plus the variable creation, inside a while loop. So if the number / ID exists, it would loop through again and try to create a new number / id.
Performance wise, I did not see any slowdown as we were adding 50k ID’s and there was already 500k in there. I am sorry I don’t have the tools handy for this example that Aaron had, but I am interested. I would assume that after about 500 million records in there it would slow down (and leading up to that as well), as the odds of creating a number that already exists would be approaching 50%, causing another loop or more. We had initial plans for 250k, knowing that one day we might need a full million, and we are already there, so needing 10 million is something that was on the radar for us.
The requirement we also had was that when someone was making a new entry, they would also be returned what their ID was right away. That is the extra “OUTPUT” line in the middle of the INSERT AND VALUE clauses. One other thing to note with the OUTPUT, I have the calculated field stored as an INT but the OUTPUT is a 10 character nvarchar. There is a view that also does this conversion, as well as another lookup stored proc. I noticed that when the ID had the CAST and then the 0-padding and then the RIGHT operators all on it, it really slowed things down just for the INSERT, which is the part that I really cared about speed. I think (and I could be wrong), I would have to perform that operation on the CHECKSUM result and then the comparison became really slow. I think that the speed of comparison for integers is faster than that of strings (I don’t have anything handy to back that up, but please let me know). If there is a person that needs a single entry, its already going to be quick, so I tried to put the “formatting” functions on the value at the time when there would need to be a format visible (ie - the OUTPUT).
The Stored Procedure below….
CREATE PROCEDURE [dbo].[sprocClubInsertNewMember]
-- procedure variables
-- data associated with the club id
AS
BEGIN
DECLARE @NewID UNIQUEIDENTIFIER;
DECLARE @newCKS INT;
DECLARE @noDups int = 1;
WHILE @noDups >= 1
BEGIN
SET @NewID = NEWID();
SET @newCKS = (abs(checksum(@NewID)));
SET @noDups = COALESCE((SELECT intClubID FROM myClubTable WHERE intClubID = @newCKS),0)
END
INSERT INTO myClubTable (..., ..., ..., ...)
OUTPUT RIGHT('000000000' + CAST(Inserted.intClubID AS VARCHAR(10)),10)
VALUES (..., ..., ..., ...)
END
GO