Subject | variable length strings |
---|---|
Author | Markus Ostenried |
Post date | 2006-12-21T16:56:07Z |
Hi all,
first of all let me say that I know about features more pressing and
needed, like SMP support, merging of architectures etc. I'm with all
the sensible people on this subject :)
A feature I would like to see for a very long time - in fact since I
started using Delphi 2 and InterBase - is a character data type of
variable length. I don't think of Varchar(n) here because it limits
you to n characters.
IMO it makes sense to be able to limit the length of a character
column but I'd rather do it with a constraint. The reason is
constraints can easily be changed, e.g. to loosen a size restriction.
I personally don't see any sense in restricting a last name to let's
say 60 characters. This is an arbitrary boundary that everyone in the
SQL world uses because it's enforced by the available data types. It
isn't necessarily based on the shape of data that one wants to store.
Another problem I encountered is with changing the maximum length. If
I encounter that some clients use computer names longer than my
assumed maximum of 15 characters then I have to change not only the
one table I use to store the computer names. I also have to change
every field that may get to contain this value through concatenation
to ensure all the fields are big enough to hold the now longer values.
(I know that's not a normalized structure in this special case). You
also have to adjust all variables in stored procs and triggers that
reference a Varchar(n) column that has been made bigger. This may get
better with usage of domains in stored procs and triggers. But with
usage of variable length strings there would be no change necessary to
that variables.
I think the current behavior is just an artifact that is quite
annoying in modern times. I don't see any performance benefits from
using Varchar(n) because the engine would still know about the maximum
length because of a constraint you can specify. Also internally run
length compression is used so I don't think the record layout would
change that much.
I don't know the engine internals but I fear changing everything to
use variable strings would be quite a big task.
I read some of Jim's slides about netfrastructure and falcon, and I
have to say that I really like the idea of having variable length
strings. To the outside and for compatibility everything looks the
same if you specify a length constraint - which the new implementation
of Varchar(n) would do internally. SQL standards aren't violated. But
if you want to then you can simply omit the length constraint and code
happily ever on.
I just wanted to bring this wonderful step into the present of
programming languages to your attention :) And I'm asking for your
opinions on this matter.
Please bear in mind that I'm a student not yet finished doing computer
science and that English isn't my native language. So if I didn't make
myself clear enough feel free to ask me what I'm really meaning.
Regards,
Markus
first of all let me say that I know about features more pressing and
needed, like SMP support, merging of architectures etc. I'm with all
the sensible people on this subject :)
A feature I would like to see for a very long time - in fact since I
started using Delphi 2 and InterBase - is a character data type of
variable length. I don't think of Varchar(n) here because it limits
you to n characters.
IMO it makes sense to be able to limit the length of a character
column but I'd rather do it with a constraint. The reason is
constraints can easily be changed, e.g. to loosen a size restriction.
I personally don't see any sense in restricting a last name to let's
say 60 characters. This is an arbitrary boundary that everyone in the
SQL world uses because it's enforced by the available data types. It
isn't necessarily based on the shape of data that one wants to store.
Another problem I encountered is with changing the maximum length. If
I encounter that some clients use computer names longer than my
assumed maximum of 15 characters then I have to change not only the
one table I use to store the computer names. I also have to change
every field that may get to contain this value through concatenation
to ensure all the fields are big enough to hold the now longer values.
(I know that's not a normalized structure in this special case). You
also have to adjust all variables in stored procs and triggers that
reference a Varchar(n) column that has been made bigger. This may get
better with usage of domains in stored procs and triggers. But with
usage of variable length strings there would be no change necessary to
that variables.
I think the current behavior is just an artifact that is quite
annoying in modern times. I don't see any performance benefits from
using Varchar(n) because the engine would still know about the maximum
length because of a constraint you can specify. Also internally run
length compression is used so I don't think the record layout would
change that much.
I don't know the engine internals but I fear changing everything to
use variable strings would be quite a big task.
I read some of Jim's slides about netfrastructure and falcon, and I
have to say that I really like the idea of having variable length
strings. To the outside and for compatibility everything looks the
same if you specify a length constraint - which the new implementation
of Varchar(n) would do internally. SQL standards aren't violated. But
if you want to then you can simply omit the length constraint and code
happily ever on.
I just wanted to bring this wonderful step into the present of
programming languages to your attention :) And I'm asking for your
opinions on this matter.
Please bear in mind that I'm a student not yet finished doing computer
science and that English isn't my native language. So if I didn't make
myself clear enough feel free to ask me what I'm really meaning.
Regards,
Markus