ANALYSE([max_elements[,max_memory]])
        ANALYSE() is defined in the
        sql/sql_analyse.cc source file, which
        serves as an example of how to create a procedure for use with
        the PROCEDURE clause of
        SELECT statements.
        ANALYSE() is built in and is available by
        default; other procedures can be created using the format
        demonstrated in the source file.
      
        ANALYSE() examines the result from a query
        and returns an analysis of the results that suggests optimal
        data types for each column that may help reduce table sizes. To
        obtain this analysis, append PROCEDURE
        ANALYSE to the end of a
        SELECT statement:
      
SELECT ... FROM ... WHERE ... PROCEDURE ANALYSE([max_elements[,max_memory]])
For example:
SELECT col1, col2 FROM table1 PROCEDURE ANALYSE(10, 2000);
        The results show some statistics for the values returned by the
        query, and propose an optimal data type for the columns. This
        can be helpful for checking your existing tables, or after
        importing new data. You may need to try different settings for
        the arguments so that PROCEDURE ANALYSE()
        does not suggest the ENUM data
        type when it is not appropriate.
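
        One way to try different settings is to run the same query with
        different arguments and compare the Optimal_fieldtype column of
        each result. The following is a minimal sketch that reuses the
        table1, col1, and col2 names from the example above; the
        argument values 16 and 256 are arbitrary illustrations, and the
        sketch assumes a server version that still supports the
        PROCEDURE clause (it was removed in MySQL 8.0).

```sql
-- Default limits: max_elements = 256, max_memory = 8192.
SELECT col1, col2 FROM table1 PROCEDURE ANALYSE();

-- Tighter limits (illustrative values only): compare the
-- Optimal_fieldtype output of this run with the run above.
SELECT col1, col2 FROM table1 PROCEDURE ANALYSE(16, 256);
```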
      
The arguments are optional and are used as follows:
            max_elements (default 256) is the
            maximum number of distinct values that
            ANALYSE() notices per column. This is
            used by ANALYSE() to check whether the
            optimal data type should be of type
            ENUM; if there are more than
            max_elements distinct values,
            then ENUM is not a suggested
            type.
          
            max_memory (default 8192) is the
            maximum amount of memory that ANALYSE()
            should allocate per column while trying to find all distinct
            values.
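
        Because ENUM is not suggested once a column has more than
        max_elements distinct values, counting the distinct values
        directly can help in choosing a first argument. A minimal
        sketch, again assuming the table1, col1, and col2 names used in
        the earlier example:

```sql
-- Count the distinct values in each column of interest.
-- A column whose count exceeds max_elements should not be
-- proposed as ENUM, according to the description above.
SELECT COUNT(DISTINCT col1) AS col1_distinct,
       COUNT(DISTINCT col2) AS col2_distinct
FROM table1;
```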
          


User Comments
I ran some tests on a table with 1000000 rows, and PROCEDURE ANALYSE() suggested the ENUM data type for every column.
mysql> SELECT id, ativada, cumprida FROM t1 PROCEDURE ANALYSE(1000000,256)\G
I think you may be misunderstanding the syntax here. Let's say we've got a table called charac which has five characters in it:
mysql> select * from charac;
5 rows in set (0.00 sec)
If we select * from charac procedure analyse(), we're using the default values, so everything comes back as ENUM:
mysql> select * from charac procedure analyse()\G
*************************** 1. row ***************************
Field_name: world.charac.charac
Min_value: A
Max_value: E
Min_length: 1
Max_length: 1
Empties_or_zeros: 0
Nulls: 0
Avg_value_or_avg_length: 1.0000
Std: NULL
Optimal_fieldtype: ENUM('A','B','C','D','E') NOT NULL
1 row in set (0.00 sec)
The first argument refers to the number of elements, and the next argument refers to the total memory assigned. So, if we do this:
mysql> select * from charac procedure analyse(5,24)\G
*************************** 1. row ***************************
Field_name: world.charac.charac
Min_value: A
Max_value: E
Min_length: 1
Max_length: 1
Empties_or_zeros: 0
Nulls: 0
Avg_value_or_avg_length: 1.0000
Std: NULL
Optimal_fieldtype: CHAR(1) NOT NULL
This time CHAR(1) is suggested for the field, which is perhaps more appropriate. Hope this helps.
Bug #44060: First option of PROCEDURE ANALYSE() does not work, second needs some work
[15 Apr 2009 5:13] Roel Van de Paar
< PARTIAL WORKAROUND >
Regarding the 'ENUM column recommendation output' issue with PROCEDURE ANALYSE, you
can still partly use this function based on the second argument only.
For instance, if you would like to have a maximum of 50 characters (excluding 'NOT NULL')
for any ENUM column declaration, use the function as follows:
PROCEDURE ANALYSE(1,50);
The '1' will not do anything (as per the bug), and the '50' will define the maximum
number of characters for any ENUM (excluding the text 'NOT NULL', as per the bug).
If you do not want to use any ENUM columns at all (and for instance use a linked lookup
table with IDs instead), you can use:
PROCEDURE ANALYSE(1,1);
A linked lookup table has the advantage that you can add new values to
the lookup table later on, and then start inserting the new IDs into the main table
immediately (i.e. no ALTER of the ENUM column is required).
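The lookup-table approach described in this comment could be sketched as follows. All table and column names here (status_lookup, main_table, status_id) and the value 'archived' are hypothetical, chosen only for illustration; the sketch assumes a storage engine that enforces foreign keys, such as InnoDB.

```sql
-- Hypothetical lookup table holding the allowed values.
CREATE TABLE status_lookup (
  id   TINYINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  name VARCHAR(50) NOT NULL UNIQUE
);

-- Hypothetical main table that stores an ID instead of an ENUM.
CREATE TABLE main_table (
  id        INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  status_id TINYINT UNSIGNED NOT NULL,
  FOREIGN KEY (status_id) REFERENCES status_lookup (id)
);

-- Adding a new allowed value is a plain INSERT; no ALTER TABLE is needed.
INSERT INTO status_lookup (name) VALUES ('archived');
```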