Twitter: @clif_high for updates. YouTube channel: clif high


Welcome to halfpasthuman - a website helping you get the most from the future in your day today...using the art and science of predictive linguistics.


About Predictive Linguistics and our methods

Predictive Linguistics is the process of using computer software to aggregate vast amounts of written text from the internet, sorting it into categories delineated by the emotional content of the words, and using the result to make forecasts based on changes in the emotional 'tone' of the larger population. A form of 'collective sub-conscious expression' is a good way to think of it. Predictive linguistics can be used to forecast trends at many different levels, from the detail of sales to individuals, all the way up to forecasts about emerging global population trends.

It is this last that concerns us here at

We invented the 'emotive reduction algorithm(s)' employed in 1997, as well as much of the emerging science behind deep data mining for emotional content over these past decades.

Predictive Linguistics uses emotional qualifiers and quantifiers, expressed as numeric values, for every word and phrase discovered and filtered in the aggregation process. Over 80% of all the words gathered will be discarded for one or more reasons.

Predictive Linguistics works because NO conscious expressions are processed through the software. Rather, the contexts discussed within the report, in the form of entities and linguistic structures (see below), are read up by the various intake software programs, and the emotional sums of the language found at that time are retrieved. Words that are identified within my system as 'descriptors' are passed through the processing as well. These descriptor words, in the main, are those words and phrases that provide us with the detail sets within the larger context sets.

As an example, the word 'prophecy' may be read up by our software at a sports-oriented forum. In that case, perhaps due to the emotional sums around the context and the emotional values of the word itself within the lexicon, it would be put into the contextual 'bin' within the database as a 'detail word'. Note that the context of the use of the word in the sports forum is lost in the process and is of no use to us in these circumstances. What occurs is that the word is picked up as being atypical in its context, and therefore of high potential 'leakage of future' value. The way this works is that most sports forum language about future events would be statistically more likely to use words such as 'bet', as in 'I bet this XXX will be the outcome', or 'I predict', or 'I think that XXX will happen'. So it is the context plus emotional values plus rarity of use within the context that flags words for inclusion in the detail level of the database.

Further, it is worth noting that most detail level words are encountered in our processing mere days before their appearance. They turn up primarily within the IM data, and next within the ST data, but a preponderance are discovered within the IM time period. This is perhaps an artifact of our processing; if so, it is one not explored due to lack of time (cosmic joke noted).
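The flagging rule described above (context plus emotional value plus rarity within that context) can be illustrated with a minimal Python sketch. This is not the actual processing code: the function name, the two thresholds, the word counts, and the emotional values are all hypothetical and chosen only to show the idea.

```python
from collections import Counter


def flag_detail_words(tokens, emotional_value, context_counts,
                      rarity_threshold=0.001, emotion_threshold=5):
    """Return words that carry emotional weight AND are rare for this context.

    Both thresholds are illustrative assumptions, not values from the source.
    """
    total = sum(context_counts.values())
    flagged = []
    for word in dict.fromkeys(tokens):  # de-duplicate while keeping order
        frequency = context_counts.get(word, 0) / total if total else 0.0
        is_emotive = emotional_value.get(word, 0) >= emotion_threshold
        is_rare = frequency <= rarity_threshold
        if is_emotive and is_rare:
            flagged.append(word)
    return flagged


# Hypothetical sports-forum statistics: 'bet' is common, 'prophecy' is not.
sports_counts = Counter({"bet": 900, "predict": 400, "think": 700,
                         "prophecy": 1, "game": 3000})
# Hypothetical emotional values standing in for lexicon entries.
lexicon_values = {"bet": 2, "predict": 3, "prophecy": 8}

post = "i bet the prophecy about this game comes true".split()
print(flag_detail_words(post, lexicon_values, sports_counts))
```

In this toy data only 'prophecy' passes both tests: 'bet' is too common and too weakly weighted for the context, which mirrors the sports forum example.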

Words are linked by their array values back to the lexicon using our set theory model (see below), and the language used within the interpretation (detail words excepted) derives from the lexicon and its links to the changing nature of contexts as they are represented within our model.

Predictive Linguistics is a field that I pioneered in 1993. The software and lexicon have been in continual change/update mode since. This is due to the constantly changing nature of language and human expression.

Predictive Linguistics works to predict future language about (perhaps) future events, due to the nature of humans. It is my operating assumption that all humans are psychic, though the vast majority do nothing to cultivate it as a skill, and are likely unaware of it within themselves. In spite of this, universe and human nature have it that they 'leak' prescient information out continuously in their choice of language. My software processing collects these leaks and aggregates them against a model of a timeline, and that information is provided in this report.

The ALTA report is an interpretation of the asymmetric trends that are occurring even this instant as millions of humans are typing billions of words on the internet. The trends are provided in the form of a discussion of the larger collections of data (dubbed entities) down to the smallest aspect/attribute swept up from daily discussions within that context. Within the ALTA report format, detail words are provided as noted below. Phrases and idiomatic expressions are also provided as details. In the main, geographic references are merely summed, and if deemed pertinent, the largest bag in the collection is discussed as a 'probable' or 'possible' location for the events being referenced within the details.

In our discussions, the interpretation is provided in a nested, set theory (fuzzy logic) pattern.


Aspects/Attributes are: collections of data that are within our broader linguistic structures and are the 'supporting' sets that provide our insight into future developments. The Aspect/Attribute sets can be considered as the 'brought along' serendipitous future forecasts by way of links between words in these sets and the lexicon.

Entities are: the 'master sets' at the 'top' of our nested linguistic structures, containing all references that center around the very broad labels that identify the entity: Markets, GlobalPop, and SpaceGoatFarts, as examples.

Lexicon is: at its core level, the lexicon is a digital dictionary of words in multiple languages/alphabets stripped of definitions other than such technical elements as 'parts of speech' identifiers.

The lexicon is quite large and is housed in a SQL database heavily populated with triggers and other executable code for maintenance and growth (human language expands continuously, so the lexicon must as well).

Conceptually, at the Prolog software engine processing level, the lexicon is a predicate assignment of a complex, multidimensional array of integers to 'labels', each of which is a word within the lexicon. The integers within the 8x8x10 level array structure are composed of emotional qualifiers, which are assigned numeric representations of the intensity, duration, impact and other values of the emotional components given by humans to that word, and emotional quantifiers, which are assigned numeric representations of the degree of each cell's level of 'emotional assignment'.
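To make the shape of a lexicon entry concrete, here is a minimal Python sketch of a word label mapped to an 8x8x10 array of integers. The real system is described as Prolog over a SQL database, so this is only an illustration: the text does not say what each axis means or which cells hold qualifiers versus quantifiers, and the helper names and cell positions used below are hypothetical.

```python
# Dimensions taken from the text: 8 x 8 x 10 integers per word.
SHAPE = (8, 8, 10)

# word label -> nested list of integers (the 'predicate assignment' in spirit)
lexicon = {}


def empty_entry():
    """Build a zero-filled 8x8x10 nested list with independent rows."""
    return [[[0] * SHAPE[2] for _ in range(SHAPE[1])] for _ in range(SHAPE[0])]


def set_value(word, i, j, k, value):
    """Store one integer cell for a word, creating the entry on first use."""
    entry = lexicon.setdefault(word, empty_entry())
    entry[i][j][k] = int(value)


def emotional_sum(word):
    """Sum every cell for a word; unknown words contribute nothing."""
    entry = lexicon.get(word)
    if entry is None:
        return 0
    return sum(value for plane in entry for row in plane for value in row)


# Hypothetical values; which cell means 'intensity' or 'duration' is assumed.
set_value("prophecy", 0, 0, 0, 7)
set_value("prophecy", 7, 7, 9, 3)
print(emotional_sum("prophecy"))
```

A dictionary keyed by word keeps the 'label to array' relationship explicit, and summing the cells is one simple reading of the 'emotional sums' mentioned earlier.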

Spyders are: software programs that, once executed, are self-directing within programmed limits (and thus are called 'bots'), and within these constraints are allowed to make choices as to which linguistic trails to explore on the internet. The job of the spyders is to search, retrieve and pre-process the 'linguistic bytes' (2048 words/phrases in multibyte character format) which are aggregated into our modelspace when processing is complete. The pre-processing is part of the exclusions process that will see 90% of all returned data eliminated from consideration in our model.
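As a small illustration of the 'linguistic byte' unit, the following Python sketch splits a retrieved token stream into groups of 2048 words/phrases. Only the figure 2048 comes from the text; the function name is hypothetical, and the handling of a short final group is an assumption.

```python
def linguistic_bytes(tokens, size=2048):
    """Yield consecutive groups of up to `size` words/phrases."""
    for start in range(0, len(tokens), size):
        yield tokens[start:start + size]


# 5000 placeholder tokens standing in for retrieved words/phrases.
chunks = list(linguistic_bytes([f"w{i}" for i in range(5000)]))
print([len(chunk) for chunk in chunks])
```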

List of entities explored in this report:


The GlobalPop entity represents the linguistic sets within the data that are focused on the future of humanity, local or global. The 'local future' focus language is aggregated into our 'global future language' sets. This entity is independent of language, alphabet, or script form, and thus is our deepest and broadest set for emotional quantifiers and qualifiers about humanity's future.

USAPop (and any other nation state/territorial reference)

All sub sets of the populace of the planet, within our modelspace are identified by either a geospatial term such as a regional terrestrial label, e.g. 'AlpinePop', or a geopolitical label, e.g. 'CanadaPop'. These are used to isolate the subset of the global populace to which the terms are being applied in the forecast. The terrestrial references are frequently used to provide a context of 'shared views/concerns', as in 'those things all mountain dwelling people will have in common separate from other humans'.


The Markets entity is a super set of linguistic structures covering paper debt markets of all kinds, commodity trading markets, physical swap markets, currency usage (within populace), digital currency developments, new technology (FinTech),


The Terra entity is the master set for all structures that relate to the planet, and the physical environment of planet earth. This master set frequently and increasingly has extensive cross links to the SpaceGoatFarts entity.


The SpaceGoatFarts entity is the master set where all data that fits under the contexts of [officially denied], [unknown], and [speculative] arrives. Our processing discovered significant amounts of [unknown] and [officially denied] data over 2000 – 2003, which led to the creation of the separate entity view labeled SpaceGoatFarts. As may be expected, this set contains the references to UFOs, Area 51, Break-away Civilization, and other 'woo-woo' subjects.

Data Types

IM = Immediacy data with forecasting effectiveness from 3 days to the end of the third week. Error range is 4 weeks.

ST = Shorter Term data with forecasting effectiveness from the 4th week out through and inclusive of the end of the 3rd month (from date of interpretation). Error range = 4 months.

LT = Longer Term data with forecasting effectiveness from the end of the 3rd month out through and inclusive of the end of the 19th month. Error range = 19 months.
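The three windows above can be expressed as a simple lookup. The Python sketch below is illustrative only: it measures the horizon in days from the date of interpretation and assumes a 30-day month, neither of which is specified in the text, and the function name is hypothetical.

```python
def data_type(days_ahead, days_per_month=30):
    """Map a forecast horizon in days to IM, ST or LT (None if outside all)."""
    if 3 <= days_ahead <= 21:                      # 3 days to end of 3rd week
        return "IM"
    if 21 < days_ahead <= 3 * days_per_month:      # 4th week to end of 3rd month
        return "ST"
    if 3 * days_per_month < days_ahead <= 19 * days_per_month:
        return "LT"                                # to the end of the 19th month
    return None


for days in (2, 3, 21, 22, 90, 91, 570, 571):
    print(days, data_type(days))
```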

Terms employed:

Cross links – links from one cell in the database and its software representation to another due to a shared linguistic structure or pattern.

Linguistic structure – In my modelspace, a linguistic structure is a 'master set' and all its contained sub sets (also known as 'directly held' sub sets). At the very highest level, each and all entities within my model are linguistic structures, which, in their turn, are composed of many sub sets of other linguistic structures. Modelspace allows for 256 layers of 'nesting' of these sets and sub sets, each of which can be a complex set of its own. Obviously the model is derived from Object Oriented Programming at its highest level.
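Since the model is described as object oriented, the nesting of master sets and directly held sub sets can be sketched as a small Python class. This is an illustration under assumptions, not the actual model: the class and method names are hypothetical, and counting the entity itself as layer 1 of the 256 is a guess.

```python
from dataclasses import dataclass, field

MAX_NESTING = 256  # layers of nesting allowed, per the text


@dataclass
class LinguisticStructure:
    label: str
    depth: int = 1  # assumption: an entity sits at layer 1
    subsets: list = field(default_factory=list)

    def add_subset(self, label):
        """Create and return a directly held sub set one layer down."""
        if self.depth >= MAX_NESTING:
            raise ValueError(f"cannot nest deeper than {MAX_NESTING} layers")
        child = LinguisticStructure(label, self.depth + 1)
        self.subsets.append(child)
        return child

    def directly_held(self):
        """Labels of the sub sets held immediately under this set."""
        return [subset.label for subset in self.subsets]


markets = LinguisticStructure("Markets")
currency = markets.add_subset("currency usage")
currency.add_subset("digital currency developments")
print(markets.directly_held(), currency.depth)
```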

Meta Data Layer – in modelspace, when a meme appears directly held in numerous sets, at the same level of support, it is labeled as a 'meta data layer'. These 'layers' can be thought of as a common linguistic structure that forms with differing supporting sets in the various entities. For clarity, a meme in Terra entity would not have the same supporting sets as that same meme in the GlobalPop entity, but both would be part of the larger meta data layer that the meme reveals.
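One way to picture the detection of a meta data layer is as a grouping step: collect the sets that directly hold a meme at a given level, and report the meme when enough sets share it. The Python sketch below is a hedged illustration; the data layout, the function name, the example memes, and the choice of two sets as the minimum for 'numerous' are all assumptions.

```python
from collections import defaultdict


def meta_data_layers(holdings, min_sets=2):
    """holdings maps entity -> {meme: level at which it is directly held}.

    Returns {(meme, level): sorted entities} for memes shared at one level.
    """
    seen = defaultdict(list)
    for entity, memes in holdings.items():
        for meme, level in memes.items():
            seen[(meme, level)].append(entity)
    return {key: sorted(entities)
            for key, entities in seen.items()
            if len(entities) >= min_sets}


# Hypothetical memes and levels, for illustration only.
holdings = {
    "Terra": {"flooding": 2, "drought": 3},
    "GlobalPop": {"flooding": 2, "drought": 4},
    "Markets": {"flooding": 2},
}
print(meta_data_layers(holdings))
```

Here 'flooding' forms a layer because three entities hold it at the same level, while 'drought' does not because its levels differ.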

Modelspace – in the interpretation, the data sets are represented on screen in a 'virtual box' fashion in which a 3d box is drawn and the lexicon linked words from the latest data processing are shown within the 3d box by position, and by the color, brightness and hue of the individual pixels. Using an algorithm of my own design and the predicate calculus of the Prolog programming language, modelspace is populated by these database representations in a manner that resembles 'scatter graphs', but at a 3 dimensional level. By toggling on or off several advanced features of our 3d box software, the various levels of data, cross links and other technical elements may be displayed.

MOM – model of modelspace. In the very first public release of information from my process, a self-referencing loop was created by internet articles about the release, and thus the next time the spyders were invoked, the process crashed on self-referential, circuitous references to my own work. As a corrective measure, MOM (model of modelspace) was devised as my very first improvement on the process. MOM holds a copy of my interpretation as well as links to areas on the net to exclude from consideration within the predictive linguistic work.
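The exclusion role of MOM can be sketched as a filter consulted before a page is processed. This Python example is illustrative only: the class name, its fields, and the host-based matching rule are assumptions, and example.com is a placeholder rather than a real excluded site.

```python
from urllib.parse import urlparse


class ModelOfModelspace:
    """Holds a prior interpretation plus hosts to exclude (hypothetical layout)."""

    def __init__(self, excluded_hosts=(), prior_interpretation=""):
        self.excluded_hosts = {host.lower() for host in excluded_hosts}
        self.prior_interpretation = prior_interpretation

    def allows(self, url):
        """False when the URL's host is an excluded host or one of its subdomains."""
        host = (urlparse(url).hostname or "").lower()
        return not any(host == excluded or host.endswith("." + excluded)
                       for excluded in self.excluded_hosts)


mom = ModelOfModelspace(excluded_hosts=["example.com"])
candidates = [
    "https://news.example.com/article-about-the-report",
    "https://example.org/unrelated-forum",
]
print([url for url in candidates if mom.allows(url)])
```

Matching on the host rather than on page text keeps the check cheap, though it would not catch quotations of the interpretation on other sites.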

Set – Our approach involves the use of complex (fuzzy) set theory originating in the software industry's quest for 'intelligent machines' or 'AI' (artificial intelligence). In our approach, the fuzzy sets are based on the ability to define such concepts as 'near', 'close to', 'about', and 'like', among many others, which provide me the ability to assign a numeric representation as a 'quantifier' to human emotions, which are the key element to future forecasting from predictive linguistics.
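A concept such as 'near' becomes numeric in fuzzy set theory through a membership function that returns a degree between 0 and 1. The Python sketch below uses a standard triangular membership function as an example; the text does not say which functions the actual system uses, so the shape, the function names, and the sample numbers are assumptions.

```python
def near(x, target, spread):
    """Triangular membership: 1.0 at target, falling to 0.0 at distance `spread`."""
    if spread <= 0:
        raise ValueError("spread must be positive")
    return max(0.0, 1.0 - abs(x - target) / spread)


def fuzzy_and(a, b):
    """Standard fuzzy intersection (minimum of the two degrees)."""
    return min(a, b)


def fuzzy_or(a, b):
    """Standard fuzzy union (maximum of the two degrees)."""
    return max(a, b)


print(near(10, 10, 5), near(12.5, 10, 5), near(20, 10, 5))
print(fuzzy_and(near(12.5, 10, 5), near(10, 10, 5)))
```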

Temporal Echo – these are linguistic echoes across time that will reference the same, previously forecast, meme and its emotional parameters. The language manifest in both instances, that is, the temporal (meme) anchor and its echo, will be related to each other, though frequently the repeating echo is larger in both scope and intensity. In some cases the meme is 'completed', in our predictive linguistics sense of that word, by the echo phase of the meme.

TM = Temporal Marker. Think of this as a 'book mark' against which you may remember specific details of the forecast. These are chosen due to some (more or less) easily identified linguistic 'tell-tale' that we expect to show up in the forecast language within media discussions.