Discuss Selecting pages

From semantic-mediawiki.org

Talk pages on this wiki should primarily be used to address possible mistakes as well as missing and superseded information in the documentation.
In case you are seeking support concerning individual questions, please have a look at this page. The Semantic MediaWiki user mailing list is always a good idea for seeking help.


Grouping ORs with ANDs in arrays

Wow, been spending all day trying to get this working correctly. I have a SemanticForms RunQuery with a form entry that asks for a location. I'd like for the ability to put in "Texas" or "Austin" or "Austin, Texas". The first two are easy, but having the entry work with an array of "Austin, Texas" is proving a nightmare. If I manually type in

{{#ask:[[City::Austin||Texas]] [[State::Austin||Texas]]}}

I get what I'm looking for. Austin is matched with City, Texas is matched with State, and only Austin locations are shown. If I try to use an array or variable like

{{#ask: [[City::{{#arraymap:{{{Entry|}}}|,|x|x|<no*wiki>||</no*wiki>}}]][[State::{{#arraymap:{{{Entry|}}}|,|x|x|<no*wiki>||</no*wiki>}}]]}}

I get an Error: the value was not understood. I feel like I've tried everything. I've even tried to do a vardefine before the ask, and then ask on the var. Any ideas on how I can get this to work? Thanks

22:39, 16 November 2012

It's probably the double pipes that don't work, even when enclosed within nowiki tags (btw, why the asterisk?). I'm not sure if I understand you correctly, but is the intended query equal to the following one?

{{#ask: [[City::Austin]] OR [[City::Texas]] [[State::Austin]] OR [[State::Texas]]
}}

In that case, this should do the trick, right?

{{#ask: {{#arraymap:{{{Entry|}}}|,|x|[[City::x]]|OR}} {{#arraymap:{{{Entry|}}}|,|x|[[State::x]]|OR}} 
}}
18:28, 30 April 2013

Define a template {{!}} as | and try to use the template as the separator in {{#arraymap:}}: {{!}}{{!}}.

18:55, 30 April 2013

Indeed:

AND {{#arraymap:{{{Entry|}}}|,|@@@@|[[City::@@@@]]| }}

OR [[City::{{#arraymap:{{{Entry|}}}|,|@@@@|@@@@|{{!}}{{!}} }}]]

18:12, 3 December 2014
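
The two expansions in this thread can be checked outside the wiki. The Python sketch below is a simplified stand-in for #arraymap (split, trim, substitute, re-join), not the parser function's actual implementation; the function name and the sample entry are chosen purely for illustration.

```python
def arraymap(value, delimiter, var, formula, new_delimiter=", "):
    """Simplified model of #arraymap: split, trim, substitute, re-join."""
    items = [item.strip() for item in value.split(delimiter)]
    return new_delimiter.join(formula.replace(var, item) for item in items if item)


entry = "Austin, Texas"

# AND: one [[City::...]] condition per value, separated by a space.
and_query = arraymap(entry, ",", "@@@@", "[[City::@@@@]]", " ")

# OR: a single condition whose values are joined by "||".
or_query = "[[City::" + arraymap(entry, ",", "@@@@", "@@@@", "||") + "]]"

print(and_query)
print(or_query)
```

In the wiki itself the "||" delimiter has to be produced through {{!}}{{!}}, as noted above, because a literal pipe would be read as a parameter separator.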
 
 
 

Problem with selecting pages with long SMW Property Values and LIKE

First and foremost I know that searching pages with a semantic property that contains some text is only allowed for the first 70 characters or less, using the syntax [[Property_name::~*needle*]], however I would like to extend this functionality.


The example I will show you in this topic could not be reproduced on sandbox.semantic-mediawiki or similar wikis, probably due to different versions of the platform. My setup is:

MediaWiki 1.25.1

PHP 5.3.3 (apache2handler)

MySQL 5.1.73

Semantic MediaWiki 2.2.2

Semantic Result Formats 2.3


I will show you screenshots generated from my wiki/DB


Let's take 2 different pages, both part of the same category (Impresa), created with the same form etc. One, Very Short Page has a very short value for the Comment property, the other, Entity with a very long labeltest1234567890, has a very long value for the Comment property as shown in the pictures below

Very Short Page: http://imgur.com/aQBu4nY

The other one: http://imgur.com/TtTymSl


Now, if I want to select these pages, I can see that their comments share a common string, "This is a page", so I could run an ask on [[Comment::~*is a page*]]

The result only includes Very Short Page because it's the only one with that substring in its first characters as shown here: http://imgur.com/Wd64Njs

Thanks to the debug result format I can see the SQL query in which the ask was translated, running the query directly on the database has the same result: http://imgur.com/rxTFi0r

But there is a problem with this: if I call [[Comment::~*c9b45d*]] I would expect no results, as no page in my wiki has that string in the Comment property. However, I do hit the Entity with a very long... page because its hashed content contains that string, which leads to a wrong result: http://imgur.com/ULYlQwz

While you might think this is a stupid search, in Italian there are words that can be formed with just letters from "a" to "f" such as "bebe", "caffe" etc. so it's possible to hit hashed properties while looking for something else entirely, these are false positive matches that I absolutely do not want.


The solution to both problems (false negatives while looking for "is a page" and false positives while looking for a gibberish hexadecimal string) is to look in the blob when it is not null and to look directly in the hash only if the blob is null, as the "Semantic Drilldown" extension does: http://imgur.com/3wICFp0


I would like to know if it is possible to hotfix this functionality in the core SemanticMediaWiki implementation or why this is a bad idea, I would appreciate guidance on how to use this workaround for all ask queries myself if needed, or any other workaround that would get me the same result.

Thanks

16:37, 27 December 2016
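
The difference between the two lookups described in this post can be sketched in Python. This is a loose model, not SMW code: the prefix length, the threshold and the way the hash column is built are assumptions made for the illustration, and all function names are hypothetical.

```python
import hashlib

PREFIX = 40      # assumed number of leading characters kept in o_hash
THRESHOLD = 72   # assumed maximum length stored directly in o_hash


def store(value):
    """Return an (o_hash, o_blob) pair, loosely modelling the blob table."""
    if len(value) <= THRESHOLD:
        return value, None
    digest = hashlib.md5(value.encode("utf-8")).hexdigest()
    return value[:PREFIX] + digest, value


def like_hash_only(needle, o_hash, o_blob):
    """Model of matching LIKE against o_hash alone."""
    return needle in o_hash


def like_blob_first(needle, o_hash, o_blob):
    """Model of the proposed lookup: use o_blob unless it is NULL."""
    return needle in (o_hash if o_blob is None else o_blob)


long_row = store("x" * 60 + " This is a page with a long comment")
hex_fragment = long_row[0][PREFIX:PREFIX + 6]  # six characters of the digest

# False negative with the hash-only lookup, fixed by the blob-first lookup.
print(like_hash_only("is a page", *long_row), like_blob_first("is a page", *long_row))
# False positive with the hash-only lookup, avoided by the blob-first lookup.
print(like_hash_only(hex_fragment, *long_row), like_blob_first(hex_fragment, *long_row))
```

The sketch only shows why the results differ; it says nothing about the performance cost of scanning the blob column, which is the concern raised in the reply below.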

As a hack/workaround I made the following modifications:

in extensions/SemanticMediaWiki/src/SQLStore/QueryEngine/QueryEngine.php -> private function getInstanceQueryResult( Query $query, $rootid ):

added near the beginning:


global $wgCustomImSearch;

// Only rewrite the condition when the custom search flag is enabled.
// The dot before o_hash is escaped so that it matches a literal ".".
if ( $wgCustomImSearch && preg_match( "/([a-zA-Z][0-9]+)\.o_hash LIKE /", $this->querySegments[$rootid]->where ) ) {
    $this->querySegments[$rootid]->where = preg_replace( "/([a-zA-Z][0-9]+)\.o_hash LIKE /", "(IF( \${1}.o_blob IS NULL , \${1}.o_hash, CONVERT( \${1}.o_blob USING utf8 ) )) LIKE ", $this->querySegments[$rootid]->where );
}


In extensions/SemanticMediaWiki/src/SQLStore/QueryEngine/QuerySegmentListResolver.php -> public function resolveForSegment( QuerySegment &$query ) -> case QuerySegment::Q_DISJUNCTION:

added a similar check to the subquery where(s) before they are translated in SQL, after sql = ; I added:

global $wgCustomImSearch;

// Same rewrite as above, applied to each subquery of the disjunction.
if ( $wgCustomImSearch && preg_match( "/([a-zA-Z][0-9]+)\.o_hash LIKE /", $subQuery->where ) ) {
    $subQuery->where = preg_replace( "/([a-zA-Z][0-9]+)\.o_hash LIKE /", "(IF( \${1}.o_blob IS NULL , \${1}.o_hash, CONVERT( \${1}.o_blob USING utf8 ) )) LIKE ", $subQuery->where );
}

$wgCustomImSearch is a global variable that I added for now, defined in LocalSettings.php, so that I can compare performance with the hack active and inactive; it is not strictly required. So far I do not see any drawback to this hack, and I have not found any other points in the code where it is needed. If you think this could be useful for the project, please let me know. If you know of any reason why this is a bad idea, once again, I'm all ears.

17:22, 28 December 2016

> Semantic MediaWiki 2.2.2 > Semantic Result Formats 2.3

Version 2.2.2 is no longer in active maintenance and therefore should be avoided as a reference when trying to modify code or cite code snippets.

> some text is only allowed for the first 70 characters or less, using the syntax ~*needle*, however I would like to extend this functionality.

To cover search in regards to longer text (or better text that is longer than the 70 char limit) or case insensitivity we have approached the problem differently in 2.5 [0] by relying on the full-text index for enabled dataTypes that are provided by the DB back-end.

> could not be reproduced on sandbox.semantic-mediawiki

The sandbox has the Fulltext search enabled where ~/!~ expressions do match conditions against the available full-text index (if you use format=debug then you should see that the SQL is generated using the MATCH ... AGAINST instead of LIKE).

> So far I do not see any drawback for this hack and I have not found any other points in the code where it is needed. If you think this could be useful for the project, please let me know. If you know of any reason why this is a bad idea

[0] cites several sources as to why using LIKE on a "long text" creates potential bottlenecks and is the reason why LIKE/NLIKE should only be considered on a limited VARCHAR field.

I think it is commendable that you dug into the code to find a workaround. Unfortunately, an arbitrary `preg_replace` or `preg_match` approach is unlikely to be adopted because of SQL conventions and the need to support different DB back-ends.

If you still think your approach should be examined more thoroughly then you may create a PR and run it against our test suite.

> only look directly in the hash only if the blob is null, like Extension:Semantic Drilldown

We consider this the wrong approach, and it may break on any schema change we apply to the SQLStore in the future.

> I would like to know if it is possible to hotfix this functionality in the core SemanticMediaWiki implementation or why this is a bad idea

We try to find implementations that are hopefully stable and provide features without impacting existing functionality or risking the introduction of instabilities.

PS: Using [1] is a better place to discuss technical (code) changes.

[0] https://github.com/SemanticMediaWiki/SemanticMediaWiki/pull/1481

[1] https://github.com/SemanticMediaWiki/SemanticMediaWiki/issues

03:09, 31 December 2016
 
 

Selecting pages with non alphanumeric values for their Label

Hey, we're currently having this problem: on our SMW all pages created with a form automatically get a Semantic Property "Label" that has the same value as the page name, so that we can search for it with #ask by querying on XYZ. However, if the Pagename, or really any other Semantic Property, contains non alphanumeric characters such as ' " & the #ask fails (even in the Special:Ask page). I tried escaping or double-escaping the characters with backslashes but that does not work. If it can help, the property datatype is set to String and does not generate warnings with those "unsearchable" values.

09:54, 7 October 2016

> currently having this problem: on our SMW all pages created with a form automatically get a Semantic Property "Label" that has the same value as the page name, so that we can search for it with #ask by querying on XYZ > alphanumeric characters such as ' " & the #ask fails (even in the Special:Ask page). I tried escaping or double-escaping

When reporting an issue you should always state the versions (MW, SMW, SF etc.) you are referring to. Aside from the unclear version, I'm a little unsure about the problem itself, and the description is a bit too ambiguous to pinpoint a possible cause. I'd suggest you try to recreate your problem on [1] so that interested parties or developers can clearly understand the circumstances of your problem and have a reference for what you are referring to as `... contains non alphanumeric characters such as ' " & the #ask fails ...`.

[1] http://sandbox.semantic-mediawiki.org

03:35, 9 October 2016

Hey James, sorry for the confusion, what I mean is something along these lines:

I have a page with a Semantic Property that has a value that contains characters such as single quotes, double quotes or "&", such as the property Has text of this page http://sandbox.semantic-mediawiki.org/wiki/TestSelectingPages

If I now run the queries that I setup in this page http://sandbox.semantic-mediawiki.org/wiki/TestSelectingPagesQuery on my wiki, the only one that works is the second. The first and the third one show no results.

My wiki runs on MW 1.25.1, SMW 2.2.2 and Semantic Forms 3.4. I tried updating SMW to 2.4.x on a local VM and the issue persists. The language of the wiki is set to Italian, in case that matters.

Thanks for your patience

09:47, 10 October 2016
 
 

Filtering deleted pages seems not to work with SQLStore3 as well

Similar to the problems already mentioned with the old store, I ran into problems with deleted pages showing up as results with SMW 1.8 on MW 1.20 using SQLStore3. I consider redirects to be part of the problem as well. As there is no bug referenced in the other thread on this page: has the problem really been properly addressed and solved yet?

Regards

Argi

09:29, 27 February 2013

See bug 42896 which is still open.

01:13, 28 February 2013

Has this problem been solved in the meantime? The problem with deleted pages showing up as result also occurs with SMW 2.0 / MW 1.23.8

17:12, 8 January 2015

I'm not aware that this issue still persists in 2.0. If you can provide a reproducible scenario, please post it on [0] with a reference to this thread (together with the versions used: MW, SMW and other SMW-related extensions). [1] is related to the SPARQLStore and was fixed in SMW 2.1.

[0] https://github.com/SemanticMediaWiki/SemanticMediaWiki/issues

[1] https://github.com/SemanticMediaWiki/SemanticMediaWiki/issues/566

18:29, 8 January 2015

Thank you very much for looking into my test wiki and for your analysis. Since I think the answer you sent me might be interesting for other people as well, I post an extract of your mail here:

MWJames mailed on 9 January 2015 at 18:18: Your ask query actually means "give me all pages in namespace Test" because Test: is registered as separate NS.

{{#ask: [[Test:+]]
| ?attribute1
| ?attribute2
}}

Since it is an unconditional query, it returns all possible pages, even deleted ones, because the SQLStore doesn't delete a subject entry; it sets it to be empty [0], leaving the subject (Page3/Page4) in the smw_object_ids table. See also note [1], which states: "TODO: Possibly delete ID here (at least for non-properties/categories, if not used in any place in rels2)"

I added a simple condition to the query and got rid of the empty pages in the result:

{{#ask: [[Test:+]] [[attribute1::+]]
| ?attribute1
| ?attribute2
}}

MWJames, thank you very much for your fast reply and your analysis.

19:44, 9 January 2015
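
The behaviour described in the mail extract can be illustrated with a small Python model. The two structures below are hypothetical, heavily simplified stand-ins for the ID table and a property table; Page1 and Page2 are invented names for pages that still exist, while Page3 and Page4 are the deleted pages mentioned above.

```python
# Simplified stand-in for the ID table: rows of deleted subjects remain.
object_ids = [
    {"title": "Page1", "namespace": "Test"},
    {"title": "Page2", "namespace": "Test"},
    {"title": "Page3", "namespace": "Test"},  # deleted, ID row left behind
    {"title": "Page4", "namespace": "Test"},  # deleted, ID row left behind
]

# Simplified stand-in for a property table: deleted pages have no values.
attribute1 = {"Page1": "a", "Page2": "b"}


def ask_namespace(namespace):
    """Model of [[Test:+]]: every ID row in the namespace is a result."""
    return [row["title"] for row in object_ids if row["namespace"] == namespace]


def ask_namespace_with_property(namespace, prop_table):
    """Model of [[Test:+]] [[attribute1::+]]: a value must exist."""
    return [title for title in ask_namespace(namespace) if title in prop_table]


print(ask_namespace("Test"))
print(ask_namespace_with_property("Test", attribute1))
```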
 
 
 
 

Excluding property/category/specific page?

May I ask if there's a way to exclude a specific page/property/category from a search or #Concept? I see here that we cannot use the ! exclamation point to exclude properties, but I wonder if there is another way?

For example, I have a #Concept with {{[[Category1]] [[Property1]] [[Property2]]}}, but don't want a child Category (of Category 1) to be included. I can exclude it by a number of attributes, including the sub-category, properties, or specific page names. Is there a way to do this?

Worse comes to worst, can I go to those several pages I want excluded and add a new property so it is excluded in the search? For example, add a show_on_search::false, but without having to go into all the other pages (those to be included) and add show_on_search::true on them?

Thanks

04:37, 8 November 2014

A category or page cannot be excluded. A specific property value can: [[Property::!Value]] will yield all pages where Property is set but not to Value.

A workaround is using a template with {{#if:...}}. Or Lua.

07:15, 11 November 2014
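
The semantics of [[Property::!Value]] stated in this reply (the property must be set, and set to something else) can be written down as a filter. The Python sketch below uses invented page data and covers single-valued properties only; how SMW treats pages with several values for the same property is not modelled here.

```python
pages = {
    "Page A": {"Show on search": "true"},
    "Page B": {"Show on search": "false"},
    "Page C": {},  # property not set at all
}


def ask_not_value(pages, prop, value):
    """Pages where prop is set, but not to the given value."""
    return sorted(
        title
        for title, props in pages.items()
        if prop in props and props[prop] != value
    )


print(ask_not_value(pages, "Show on search", "false"))
```

Note that "Page C" is not returned: a page without the property is not matched, which is why the exclusion flag in the original question would have to be set on the pages to be included as well.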
 

Non-existing pages

Am I correct in saying that this page (and related ones) doesn't say how to make queries list (only) non-existing pages, i.e. red links, nor whether it's impossible? Or is it implied somewhere I'm not seeing? Perhaps it's implied somewhere that a page must exist for a semantic property to be declared about it?

Example: I want to make a list like http://wikipapers.referata.com/wiki/List_of_authors , but for pages which don't exist.

19:24, 2 November 2014

No, it is not possible to query for non-existing information. Setting a boolean property on existing pages for links which point to non-existing pages using the #ifexist parser function may be a way out. Thus you can query for this information.

10:37, 3 November 2014
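
The workaround suggested above amounts to recording, on each existing page, which of its links lead nowhere, and then querying that record. A minimal Python sketch of the idea follows; the page names are invented, and the sketch stores the missing targets themselves, which is a variation on the boolean flag mentioned in the reply.

```python
existing_pages = {"Paper A", "Alice"}   # hypothetical wiki content
links = {"Paper A": ["Alice", "Bob"]}   # existing page -> linked author pages


def annotate_missing(links, existing_pages):
    """Model of an #ifexist check performed for every link on a page."""
    return {
        page: [target for target in targets if target not in existing_pages]
        for page, targets in links.items()
    }


missing = annotate_missing(links, existing_pages)

# The "query" then runs over the stored annotation, not over red links.
red_links = sorted({target for targets in missing.values() for target in targets})
print(red_links)
```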

Ok, thanks. Now, where to mention this assumption of the docs? The answer came to me when I thought a while about it, but it's not so obvious.

22:14, 7 November 2014
 
 

Semantic Maps

Hi All,
I am a newbie to MediaWiki.
I have a property Country with all countries as values.
And I have a Project form with a Countries field (multi-select), i.e. the ABC project is in 2 countries (India and China).
Now, for example
ABC => India, China
DEF => China, Japan
PQR => India, France

I want to build a map showing markers based on Countries [Property]
For Example,
India marker should show "ABC, PQR"
China marker should show "ABC, DEF"
Japan marker should show "DEF"
France marker should show "PQR"

Could anybody please help me build the query and show the map of projects based on countries?

12:36, 26 September 2014

Case insensitive query possible?

I have a search RunQuery that I would like to be case-insensitive. Is this possible? It seems the Like function used to be case insensitive, but isn't anymore. My query looks like this:

{{#ask:{{#if:{{{City|}}}|[[City::{{{City|}}}]]}}
|format=table
}}
16:56, 5 November 2012

I haven't found any configuration settings or query settings to make a query case insensitive. I think you have to do a feature request in the mailing lists.

11:52, 10 November 2012
 

Check your database collation.

11:56, 17 November 2012

I was thinking about changing the collation, but I know very little about MySQL. In phpMyAdmin I found a table called smw_ids, but I'm not sure if this is where to make the change. Clicking on structure, there was no collation shown on any of the rows. I tried changing the title and sort_key to utf8_general_ci, but it still didn't show anything set for collation in these rows. Does anyone know where/how to make the change? Thanks

20:55, 19 November 2012

This is a long-standing feature request, but yes, the answer tends to be "change your collation". Unfortunately, no one has been able to tell how to do so in a safe way (e.g. in phpmyadmin).

18:42, 30 April 2013

I've managed to get case insensitive ask queries in MW 1.21.1, SMW 1.8.0.5, by setting the character set and collation, and changing 3 columns from varbinary to varchar.

  1. During the database step in the mediawiki installation, choose UTF8 as the database character set. (If binary was chosen the DB needs to be recreated and pages migrated). Note that issues have been noted by other users with UTF8 and some languages, so test this thoroughly if using a language containing special characters.
  2. Backup database using mysqldump
  3. Open MySQL Workbench as the root (or other authorised) user
  4. Run the following script:
use [Your_Wiki_DB];

/* The following show the character set (expecting utf8) and the collation (expecting utf8_general_ci) of the database, respectively.  
 * Uncomment and run against the relevant database to show the default settings. 
 * Note that changing these does not affect existing tables. 
 */
-- show variables like "character_set_database";
-- show variables like "collation_database";

/* The following alter the relevant columns in the SMW tables to get case insensitive searching working */
-- smw_title
ALTER TABLE [Your_Wiki_DB].smw_object_ids change smw_title smw_title_bak varbinary(255);
ALTER TABLE [Your_Wiki_DB].smw_object_ids ADD smw_title VARCHAR(255) AFTER smw_namespace;
update [Your_Wiki_DB].smw_object_ids set smw_title = cast(smw_title_bak as CHAR) ;

-- smw_sort_key
ALTER TABLE [Your_Wiki_DB].smw_object_ids change smw_sortkey smw_sortkey_bak varbinary(255);
ALTER TABLE [Your_Wiki_DB].smw_object_ids ADD smw_sortkey VARCHAR(255) AFTER smw_subobject;
update [Your_Wiki_DB].smw_object_ids set smw_sortkey = cast(smw_sortkey_bak as CHAR) ;

-- smw_di_blob
ALTER TABLE [Your_Wiki_DB].smw_di_blob change o_hash o_hash_bak varbinary(255);
ALTER TABLE [Your_Wiki_DB].smw_di_blob ADD o_hash VARCHAR(255) AFTER o_blob;
update [Your_Wiki_DB].smw_di_blob set o_hash = cast(o_hash_bak as CHAR) ;
COMMIT;
5. Case insensitive searching should now work

Note that you may get an error with the UPDATE statements if your preferences in MySQL Workbench are set with "Safe Update" on. This can be turned off by going to Edit>Preferences...>SQL Queries tab. Then uncheck "Safe Updates"... and reconnect to the DB.

00:15, 10 October 2013

Note that the SQL table changes will be reverted if you upgrade your SMW version, so would need to be reapplied. Painful I know, but for us case insensitivity is mandatory.

22:09, 29 October 2013
 
 
 
 

You can try a like query as a workaround. It won't work for vAlUe, but it might help.

22:06, 3 April 2014
 

A workaround that does not require changing database collation is setting another property to capitalised value of the first one and querying it.

03:19, 18 August 2014
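
The shadow-property workaround can be sketched as follows. This Python model uses invented page data and a hypothetical property name "City upper"; in the wiki, the normalised copy would be set by the template that stores the original property.

```python
pages = {
    "Office 1": {"City": "Austin"},
    "Office 2": {"City": "AUSTIN"},
    "Office 3": {"City": "Dallas"},
}


def add_normalised(pages, prop, shadow):
    """Store an upper-cased copy of prop under a second property."""
    for props in pages.values():
        if prop in props:
            props[shadow] = props[prop].upper()


def ask(pages, prop, value):
    """Exact, case-sensitive match, as a plain property condition would be."""
    return sorted(title for title, props in pages.items() if props.get(prop) == value)


add_normalised(pages, "City", "City upper")

user_input = "austin"
print(ask(pages, "City", user_input))                 # exact match on the original
print(ask(pages, "City upper", user_input.upper()))   # match on the normalised copy
```

The query input has to be normalised in the same way as the stored copy, otherwise the comparison is still case-sensitive.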
 

How does one "SUM" and "GROUP BY"?

Hi all,

I'm trying to select a set of data and do a "SUM" and "GROUP BY" on the result.

I have the following ask:

{{#ask:[[Category:Knowledge Transfer Log]]
|?Date
|?Total
|format=broadtable
|sort=Date
}}

Which results in this:

           Date           Total
Item 0004  29 April 2013  5
Item 0005  6 May 2013     2
Item 0006  10 May 2013    3
Item 0007  14 May 2013    16
Item 0003  14 May 2013    17
Item 0008  15 May 2013    10

As you can see, Item 7 and Item 3 are both on the 14th May, however I'd like one record, summed, with a total of 33 instead of two separate records.

I've read up about the Sum format but that adds all rows together. I've also seen this trick of getting unique values, but it won't work with #ask.

Any help would be greatly appreciated.

09:10, 19 May 2013

Hi,

Has no-one solved this problem?

Cheers Walter

14:23, 8 August 2013

I'm afraid there's no way to GROUP or SELECT DISTINCT. SMW selects pages (and internal objects). You can only use nested queries: first select the dates (you have to use the Array extension or Lua to make the selection distinct), format the selection with a template, and include a query for Total in the template. Or use Lua from the beginning.

03:04, 18 August 2014
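
Whichever tool ends up doing the post-processing (Lua, arrays or something external), the grouping step itself is small. The Python sketch below applies it to the rows from the question; it only illustrates the logic and is not something SMW can run.

```python
rows = [
    ("Item 0004", "29 April 2013", 5),
    ("Item 0005", "6 May 2013", 2),
    ("Item 0006", "10 May 2013", 3),
    ("Item 0007", "14 May 2013", 16),
    ("Item 0003", "14 May 2013", 17),
    ("Item 0008", "15 May 2013", 10),
]


def sum_by_date(rows):
    """Group rows by date and add up their totals, keeping first-seen order."""
    totals = {}
    for _item, date, total in rows:
        totals[date] = totals.get(date, 0) + total
    return totals


totals = sum_by_date(rows)
for date, total in totals.items():
    print(date, total)
```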
 
 

Like for urls

Hello. The like feature only works for String and Text according to the manual. Is there any way to do something like [[has url::~*blogspot*]] to get all pages which have url to Blogspot domain? Thanks.

08:43, 19 June 2014

A workaround is possible: if you set Has URL property by a template, you can also make the template extract the domain part from the URL and save it as another property, of String type, e.g. Has domain. It will also be faster.

02:59, 18 August 2014
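
The domain extraction that the template would have to perform can be sketched in Python. The URLs and page names below are invented, and "Has domain" is the hypothetical property name from the reply; in the wiki the extraction would need a string or Lua function instead.

```python
from urllib.parse import urlsplit


def domain_of(url):
    """Return the host part of a URL, or an empty string if there is none."""
    return urlsplit(url).hostname or ""


has_url = {
    "Blog A": "http://example.blogspot.com/2014/06/post.html",
    "Site B": "https://www.example.org/about",
}

# Store the domain as a second, string-typed value next to the URL.
has_domain = {title: domain_of(url) for title, url in has_url.items()}

# A substring match on the short domain string replaces the match on the URL.
blogspot_pages = sorted(t for t, domain in has_domain.items() if "blogspot" in domain)
print(blogspot_pages)
```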
 

Problem with Ask query Category:Actor

Hello, I am attempting to upgrade a MediaWiki (1.15.1) and SMW (Version 1.4.3) to the most current version of both MediaWiki and SMW. After the upgrade, everything is working fine except the ASK query with one colon (:), for example. I keep getting zero results when I run any Ask query with one colon. Any help?

22:09, 17 August 2014

What do you mean by "ASK query with one colon (:)"? Will you copy the whole query here?

02:56, 18 August 2014
 

How can I sort a table so that the current date is displayed?

I tried putting this:


05:13, 26 March 2014

How to exclude the current page from the results of a query on the page

For specific use cases it might be desirable to exclude exactly the one page that contains the query from the set of results of this query. This is currently not a feature of SMW but there are workarounds, for example:

  1. Install Extension:Semantic Extra Special Properties
  2. When following the installation instructions make sure you add the special property for the revision ID, for example by adding the following to LocalSettings.php:
    $sespSpecialProperties[] = '_REVID';// Add property with current revision ID
    
  3. Wait until Special:SMWAdmin is done
  4. Add [[Revision ID::!{{REVISIONID}}]] to the query string of the inline query
01:57, 27 February 2013

Selecting pages that have the same value set for a property

On the page for a movie (directed by a director) I would like to display a list of other movies directed by the same director. I.e. other pages in the category:Movies that have the same value set for the property:Directed by.

{{#ask: [[Directed by::{{#show: {{PAGENAME}}| ?Directed by}}]] [[Category:Movies]] |limit=10 |format=datatable }}

The above example produces a datatable with other movies directed by the same director, but it also outputs a warning (caused by the |-sign probably... WARNING: part "" of query not understood, may cause unexpected results...). Is there a better way for doing this?

Regards, Martin

22:33, 8 October 2012

Try ?Directed by#-.

But I think that querying itself is a bad idea. Better use [[Directed by::{{{Whatever_template_parametre_for_director}}}]].

02:45, 9 October 2012

Thanks

16:08, 13 October 2012
 

What do you mean by {{{Whatever_template_parametre_for_director}}}? I have the same request, but I don't know how to do it.

I have a public proposal linked to Themes. For example, a proposal page :"Give an house to all people" with links to Themes "Human Rights" and "Economy" (property = Proposal-Themes) and I want to see all the other proposals related to the same Themes (in this case "Human Rights" and "Economy").

When I try to list with the primary "ask" as described by Martinwiss, I get parsing errors like:

The part "|Human Rights" of the query was not understood. Results might not be as expected. The part "," of the query was not understood. Results might not be as expected. The part "|Economy" of the query was not understood. Results might not be as expected. The part "," of the query was not understood. Results might not be as expected. The part "]]" of the query was not understood. Results might not be as expected.

11:42, 2 February 2013

Of course, you have to process the parameter before passing it to a query if it is a list or contains wiki markup. I do it with regular expressions.

15:40, 2 February 2013
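
The kind of preprocessing meant here can be sketched in Python: strip the link markup from the list, then join the bare values into a single OR condition. The regular expression and the function names are illustrative assumptions, not the poster's actual expressions; the property name and values come from the question above.

```python
import re

# Matches [[Target]] and [[Target|Label]], capturing the displayed text.
LINK = re.compile(r"\[\[(?:[^|\]]*\|)?([^\]]*)\]\]")


def strip_links(text):
    """Replace wiki links with their displayed text."""
    return LINK.sub(r"\1", text)


def or_condition(prop, raw_list, delimiter=","):
    """Build one [[prop::a||b]] condition from a delimited, marked-up list."""
    values = [v.strip() for v in strip_links(raw_list).split(delimiter)]
    return "[[" + prop + "::" + "||".join(v for v in values if v) + "]]"


raw = "[[Human Rights|Human Rights]], [[Economy]]"
print(or_condition("Proposal-Themes", raw))
```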
 
 
 

Filtering based on the titles of the page

I have a number of pages whose titles are dates, and I would like to filter them to receive pages within a certain date range. The pages don't have any specific property set that is equal to the date in the title. Is there a way to do this with the SMW query language?

01:36, 25 January 2013

No there isn't.

06:41, 25 January 2013

Thanks for the reply. I've figured out that I could create a special property along the lines of the SemanticExtraSpecialProperties extension.

08:24, 25 January 2013
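
The idea behind such a property is to turn the title into a real date value once, so that range conditions can then be applied to it. A minimal Python sketch follows; it assumes titles in YYYY-MM-DD form, which the thread does not specify, and the sample titles are invented.

```python
from datetime import date, datetime


def title_to_date(title, fmt="%Y-%m-%d"):
    """Parse a page title as a date, or return None if it is not one."""
    try:
        return datetime.strptime(title, fmt).date()
    except ValueError:
        return None


titles = ["2013-01-10", "2013-01-25", "2013-02-03", "Main Page"]

# Derived "property": only titles that parse as dates get a value.
title_dates = {t: d for t in titles if (d := title_to_date(t)) is not None}


def in_range(title_dates, start, end):
    """Titles whose derived date lies within [start, end]."""
    return sorted(t for t, d in title_dates.items() if start <= d <= end)


print(in_range(title_dates, date(2013, 1, 20), date(2013, 1, 31)))
```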