|
|
|
Gregory Crane |
|
Professor of Classics |
|
Winnick Family Chair of Technology and
Entrepreneurship |
|
Perseus Project |
|
Tufts University |
|
|
|
|
|
DL development 1987- |
|
Ancient Greco-Roman Culture |
|
DLI-2: “A Digital Library for the Hum.” |
|
Up through early 20th century |
|
Calculatedly disparate collections |
|
Production: www.perseus.tufts.edu |
|
9 million pages/month, 85% Greco-Roman |
|
Research: what characterizes cultural DLs? |
|
Audience / Services / Content Model Triad |
|
Cross-over: e.g. NSDL work |
|
|
|
|
|
|
Why not “Computational Humanities,” “Humanities
Computing,” “Computing and the Humanities”? |
|
Too confining |
|
Textual and fine arts |
|
Associations with canonical culture, esp.
western |
|
Cultural Informatics -- very broad |
|
Challenging but important perspective |
|
|
|
|
|
|
Object of Study: |
|
Geo-spatially open: All cultures of the world |
|
Temporally open: cultures as evolving process |
|
Past, present and future |
|
Goals: |
|
Analysis of cultures |
|
Communication between cultures |
|
Fundamental to world peace and prosperity |
|
|
|
|
|
|
|
Two domains of study |
|
Knowledge as end product |
|
Wissenschaft / Germanic model |
|
Knowledge as means to an end … |
|
But what end? |
|
British tradition: Character formation |
|
Traditional culture: cultivation of
sensibilities |
|
|
|
|
|
|
Jerusalem |
|
Kosovo |
|
Baghdad |
|
Mecca |
|
Congo |
|
Rwanda |
|
|
|
|
|
|
Visualization: |
|
Tracking anger against the US |
|
(terrorism/national security) |
|
Identifying cultural trends |
|
(Marketing/trade) |
|
Broad educational |
|
Acquiring information, individual and
comparative |
|
|
|
|
|
|
|
|
|
|
|
F-Measures for Place Name Identification |
|
Includes semantic classification and
identification (Which Springfield) |
|
Greco-Roman Sources: 95% |
|
European Sources: 90% |
|
US Sources: 80%!!! |
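
For reference, an F-measure combines precision and recall into a single score. The sketch below shows the standard calculation. The counts in the usage lines are hypothetical and are not taken from the Perseus evaluation; here a "true positive" is assumed to mean a place name that was both detected and resolved to the correct referent (the right Springfield).

```python
def f_measure(true_positives, false_positives, false_negatives, beta=1.0):
    """Return the F-beta score from raw counts (beta=1.0 gives the usual F1).

    Assumption for this sketch: a true positive is a place name that was
    both found and linked to the correct place.
    """
    predicted = true_positives + false_positives   # names the system reported
    actual = true_positives + false_negatives      # names really in the text
    if predicted == 0 or actual == 0:
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    if precision + recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts, for illustration only.
score = f_measure(true_positives=90, false_positives=10, false_negatives=10)
print(f"F1 = {score:.2f}")
```

Because the score is a harmonic mean, a system cannot reach a high F-measure by maximizing precision or recall alone.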
|
|
|
|
|
|
|
|
|
|
|
Customized knowledge support |
|
What info do readers A vs. B need? |
|
Backgrounds, purposes etc. |
|
Documents: what am I reading? |
|
Objects: what is this thing? |
|
Spaces: where am I moving? |
|
Audiences |
|
Tourists and visitors |
|
Peace-keepers and ground forces |
|
Business |
|
|
|
|
|
|
|
Cambridge Civil War Monument (1870) |
|
Linking to other data |
|
City Directories |
|
Regimental Histories |
|
Period Maps |
|
Old Photographs |
|
|
|
|
|
Instantaneous responses shape experience |
|
Can you look up the point of a joke? |
|
Externalized and objectified knowledge |
|
FOUNDATION of modern society |
|
Internalized and personalized knowledge |
|
FOUNDATION of human consciousness |
|
|
|
|
|
Internalized, evolving, and dynamic |
|
Requires intensive knowledge acquisition |
|
Cf. Running, finger exercises, practice,
practice |
|
Enhances every element of society |
|
We see more where we look |
|
We hear more when we listen |
|
We feel more when we live |
|
|
|
|
|
Localized and personal |
|
Universal in human experience |
|
Deeply intertwined with self and identity |
|
Perceived as devalued in the West and US
especially |
|
This perception is a core security danger |
|
Source for much (most?) strategic anti-US
feeling |
|
|
|
|
|
|
|
Objective, scientific study of the processes
whereby we promote enhanced subjective, emotional experiences in the world |
|
Information --> knowledge --> wisdom |
|
Fundamentally interdisciplinary |
|
Complements humanities computing |
|
|
|
|
|
Moving through a neighborhood |
|
When were these houses built? What is their
style? Who lived here? |
|
Moving through an ecosystem |
|
What are the plants/animals? |
|
What systems are in play? |
|
Answers to every quantifiable question delivered
in real time on the spot |
|
|
|
|
|
Continuation of reading revolution |
|
1760-1830, before and after |
|
Now requires a cultural informatics |
|
Includes but transcends textual materials |
|
What is the point of health and prosperity? |
|
Emerson’s American Scholar in the 21st century |
|
|
|
|
|
|
|
Quantitative data -- easiest |
|
States self-organize into databases (“Seeing
like a state”) |
|
Linguistic data -- hard |
|
Minimally dozens, if not hundreds |
|
Varying level of documentation |
|
Cultural data -- hardest |
|
Language/Culture clusters: thousands+ |
|
|
|
|
|
at the limits AND intersection of |
|
manual analytic techniques |
|
generic computational techniques |
|
Cross-trained experts |
|
Serve as connectors between specialists |
|
Have intuitive understanding of
not-yet-articulated possibilities from BOTH sides |
|
|
|
|
|
|
Aggregation and Visualization |
|
Extraction from many examples |
|
Quantified, targeted generalizations |
|
Focus and customization |
|
Start from a document/object/scene |
|
Customized decision support |
|
Yes/No decisions (~search) |
|
Discursive analysis (~browsing) |
|
|
|
|
|
Players -- no real specialists |
|
Faculty in higher education |
|
Librarians |
|
Think tanks |
|
Intelligence Community |
|
Broadcast media |
|
Journalists and professional authors |
|
|
|
|
|
Computing and the Humanities |
|
Focus on semi-passive analysis |
|
Emphasis on publication |
|
Social science & empirical data |
|
How well do we work with heterogeneous data? |
|
How well do we work with multiple languages? |
|
Computer and Information Science |
|
How far have we gone in document
understanding? |
|
Do we distinguish encyclopedic/semantic
data? |
|
|
|
|
|
|
|
|
Cultural Grant Agencies: IMLS, NEA, NEH |
|
Governmental libraries: LOC to public libs |
|
Governmental museum/sites: SI, NPS |
|
Intelligence agencies: CIA, NSA, etc. |
|
NSF: SBE, experiments with DLI, ITR |
|
|
|
|
|
|
Provide new kinds of training |
|
Cultural Informatics |
|
As self-standing discipline? |
|
As new specialty in History/Anthro/classics etc. |
|
As new specialty within Computer Science |
|
As logical extension of Lib and Info Science |
|
|
|
|
|
|
Core cultural informatics experts |
|
50? Able to coordinate many different efforts |
|
History/Information Science |
|
Domain Specific experts |
|
100s/1000s of experts in Area Studies/Lang Tech
etc. |
|
Build up to 100? Grad students/postdocs |
|
Research support: $50m/year? |
|
|
|
|
|
|
Create technological infrastructure |
|
Broaden/expand the evaluation forums |
|
More TREC/ACE/DUC/CLEF/SENSEVAL etc. |
|
Build knowledge resources |
|
Parallel corpora, lexica, portable heuristics |
|
Focus on broad semantic as well as encyclopedic
analysis |
|
Homo ignavus (lat.) ~ “bad man” but … |
|
|
|
|
|
First cut: 100 languages in five years |
|
Allow $1,000,000/language: $100m |
|
US Knowledge Sources --> 1922 (Pub Dom) |
|
City directories, Census |
|
Newspapers & Periodicals |
|
Encyclopedias, school texts, manuals |
|
Maps, gazetteers |
|
Allow avg $1,000,000/year @ 300 years: $300m |
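
The two estimates above are simple multiplications, sketched here so the assumptions can be varied. The function name and the combined total are illustrative additions; the slide states only the two line items ($100m and $300m), and the sum is derived from them.

```python
def budget_totals(languages=100, cost_per_language=1_000_000,
                  years_of_sources=300, avg_cost_per_year=1_000_000):
    """Reproduce the slide's rough cost estimates in US dollars.

    Defaults are the figures on the slide; 'combined' is just their sum
    and is not a number the slide itself gives.
    """
    language_total = languages * cost_per_language      # 100 languages
    sources_total = years_of_sources * avg_cost_per_year  # 300 years of US sources
    return {
        "languages": language_total,
        "us_sources": sources_total,
        "combined": language_total + sources_total,
    }


totals = budget_totals()
for item, dollars in totals.items():
    print(f"{item}: ${dollars / 1_000_000:.0f}m")
```

Changing a single argument, for example `budget_totals(languages=50)`, shows how sensitive the overall figure is to each assumption.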
|
|
|
|
|
|
|
|
World peace and prosperity are the goal |
|
What US agencies do what? |
|
IMLS, NEH, NEA, LOC, SI, NPS all have roles |
|
But much work must be situated in NSF |
|
Cultural informatics includes scientific and
engineering research |
|
NSF should, at the least, incubate these aspects
of cultural informatics |
|
|
|
|
Can dynamically plot cultural states across the
globe from dozens of language/culture combinations |
|
Can support reading/spatial exploration/object
analysis customized for many different categories of user |
|